The Institute of Statistical Mathematics

We present a new class of interacting Markov chain Monte Carlo algorithms for numerically solving discrete-time measure-valued equations. The associated stochastic processes belong to the class of self-interacting Markov chains. In contrast to traditional Markov chains, their time evolutions depend on the occupation measure of their past values. This general methodology allows us to provide a natural way to sample from a sequence of target probability measures of increasing complexity. We develop an original theoretical analysis of the behavior of these iterative algorithms, which relies on measure-valued processes and semigroup techniques. We establish a variety of convergence results including exponential estimates and a uniform convergence theorem with respect to the number of target distributions. We also illustrate these algorithms in the context of Feynman-Kac distribution flows.

1. Introduction. 

1.1. Nonlinear measure-valued processes. Let $(S^{(l)}, \mathcal{S}^{(l)})_{l\geq 0}$ be a sequence of measurable spaces. For every $l \geq 0$ we denote by $\mathcal{P}(S^{(l)})$ the set of probability measures on $S^{(l)}$. Suppose we have a sequence of probability measures $\pi^{(l)} \in \mathcal{P}(S^{(l)})$ where $\pi^{(0)}$ is known and we have for $l \geq 1$ the following nonlinear measure-valued equations

(1.1) $$\pi^{(l)} = \Phi_l(\pi^{(l-1)})$$

for some mappings $\Phi_l : \mathcal{P}(S^{(l-1)}) \rightarrow \mathcal{P}(S^{(l)})$. Except in some particular situations, these measure-valued equations do not admit an analytic solution.

Being able to solve these equations numerically has numerous applications 
in nonlinear filtering, global optimization, Bayesian statistics and physics as 
it would allow us to approximate any sequence of fixed "target" probability distributions $(\pi^{(l)})_{l\geq 0}$. For example, in a nonlinear filtering framework $\pi^{(l)}$ corresponds to the posterior distribution of the state of an unobserved dynamic model at time $l$ given the observations collected from time $0$ to time $l$. In an optimization framework, $\pi^{(l)}$ could correspond to a sequence of annealed versions of a distribution $\pi$ that we are interested in maximizing. In both cases, $\Phi_l$ is a Feynman-Kac transformation [5].

In recent years, there has been considerable interest in the development of interacting particle interpretations of measure-valued equations of the form (1.1), which we briefly review here.

1.2. Interacting particle methods. The central idea of interacting particle methods is to construct a Markov chain $X^{(l)} = (X^{(l)}_p)_{1\leq p\leq N}$ taking values in the product spaces $(S^{(l)})^N$ so that the empirical measure $\pi^{(l)}_N := \frac{1}{N}\sum_{p=1}^N \delta_{X^{(l)}_p}$ approximates $\pi^{(l)}$ as $N \uparrow \infty$. In the simpler version, we construct inductively $X^{(l)} = (X^{(l)}_p)_{1\leq p\leq N}$ by sampling $N$ independent random variables with common law $\Phi_l(\pi^{(l-1)}_N)$. The rationale behind this is that the resulting particle measure $\pi^{(l)}_N$ should be a good approximation of $\pi^{(l)}$ as long as $\pi^{(l-1)}_N$ is a good approximation of $\pi^{(l-1)}$. More formally, $X^{(l)}$ is an $(S^{(l)})^N$-valued Markov chain with elementary transitions given by the following formula:

(1.2) $$P\big((X^{(l)}_1,\ldots,X^{(l)}_N) \in dx \mid X^{(l-1)}\big) = \prod_{p=1}^N \Phi_l\Big(\frac{1}{N}\sum_{1\leq q\leq N}\delta_{X^{(l-1)}_q}\Big)(dx_p),$$

where $dx = d(x_1,\ldots,x_N) = dx_1\times\cdots\times dx_N$ stands for an infinitesimal neighborhood of a point in the product space $(S^{(l)})^N$.

For Feynman-Kac transformations, these interacting particle models have 
been extensively studied and they are sometimes referred to as sequential 
Monte Carlo methods, particle filters and population Monte Carlo methods; 
see [5, 8] for a review of the literature. In this context, the convergence anal- 
ysis of these particle algorithms is now well understood. A variety of theoret- 
ical results are available, including sharp propagations of chaos properties, 
fluctuations and large deviations theorems, as well as uniform convergence results with respect to the level index $l$.

These interacting particle methods suffer from two serious limitations. 
First, when the mapping $\Phi_l$ is complex, it may be impossible to generate independent draws from it. Second, it is typically impossible to determine
beforehand the number of particles necessary to achieve a fixed precision for 
a given application and users usually have to perform multiple runs for an 
increasing number of particles until stabilization of the Monte Carlo esti- 
mates is observed. Markov chain Monte Carlo (MCMC) methods appear as 
a natural way to solve these two problems [12]. However, standard MCMC 
methods do not apply in this context as we have a sequence of target distri- 
butions defined on different spaces and the normalizing constants of these 
distributions are typically unknown. 

1.3. Self-interacting Markov chains. We propose here a new class of 
interacting MCMC methods (i-MCMC) to solve these nonlinear measure- 
valued equations numerically. These i-MCMC methods can be described as 
adaptive and dynamic simulation algorithms which take advantage of the 
information carried by the past history to increase the quality of the next 
sequence of samples. Moreover, in contrast to interacting particle methods, 
these stochastic algorithms can increase the precision and performance of 
the numerical approximations iteratively. 

The origins of i-MCMC methods can be traced back to a pair of articles 
[6, 7] presented by the first author in collaboration with Laurent Miclo. 
These studies are concerned with biology-inspired self-interacting Markov 
chain (SIMC) models with applications to genetic type algorithms involving 
a competition between a reinforcement mechanism and a potential function 
[6, 7]. These ideas have been extended to the MCMC methodology in the 
joint articles of the authors with Christophe Andrieu and Ajay Jasra [1], as 
well as in the more recent article of the authors with Anthony Brockwell 
[4]. Related ideas have also appeared in computational chemistry [10] and 
statistics [9]. 

In the present article, we design a new general class of i-MCMC methods. 
Roughly speaking, these algorithms proceed as follows. At level $l = 0$ we run an MCMC algorithm to obtain a chain $X^{(0)} = (X^{(0)}_n)_{n\geq 0}$ targeting $\pi^{(0)}$. Note that here the "time" index $n$ corresponds to the number of iterations of the i-MCMC algorithm. We use the occupation measure of the chain $X^{(0)}$ at time $n$ judiciously to design a second MCMC algorithm to generate $X^{(1)} = (X^{(1)}_n)_{n\geq 0}$ at level 1 targeting $\pi^{(1)}$, which is typically more complex than $\pi^{(0)}$. More precisely, the elementary transition $X^{(1)}_n \rightsquigarrow X^{(1)}_{n+1}$ of the chain $X^{(1)}$ at time $n$ depends on the occupation measure of the chain $X^{(0)}$ up to time $n$.

Similarly we use the empirical measure of $X^{(l-1)}$ at level $l-1$ to "feed" an MCMC algorithm generating $X^{(l)}$ targeting $\pi^{(l)}$ at level $l$. These i-MCMC samplers are SIMC in reference to the fact that the complete Markov chain $X^{[m]}_n := (X^{(l)}_n)_{0\leq l\leq m}$, associated with a fixed series of $m$ levels, evolves with elementary transitions $X^{[m]}_n \rightsquigarrow X^{[m]}_{n+1}$ that depend on the occupation measure of the whole system from time $0$ up to time $n$.

From the pure mathematical point of view, the convergence analysis of 
SIMC is essentially based on the study of the stability properties of sophis- 
ticated Markov chains with elementary transitions depending in a nonlinear 
way on the occupation measure of the chains. Hence the theoretical analysis 
of SIMC is much more involved than the one of traditional Markov chains. 
It also differs significantly from interacting particle methods developed in 
[5]. Besides the introduction of a new methodology, our main contribution is 
a refined theoretical analysis based on measure-valued processes and semi- 
group methods to analyze their asymptotic behavior as the time index n 
tends to infinity. 

The rest of the paper is organized as follows: 

The main notation used in this work is introduced in a brief preliminary Section 1.4. The i-MCMC methodology is detailed formally in Section
1.5. The main results of the article are presented in Section 1.6. Several 
examples of i-MCMC methods are provided in Section 2. This section also 
provides a discussion on how to combine interacting particle methods with 
i-MCMC methods. Section 3 is concerned with the asymptotic behavior of 
an abstract class of time inhomogeneous Markov chains. In Section 3.2, we 
present a preliminary resolvent analysis to estimate the regularity properties 
of Poisson operator and invariant measure type mappings. In Section 3.3, 
we apply these results to study the law of large numbers and the concen- 
tration properties of time inhomogeneous Markov chains. In Section 4 we 
discuss the regularity properties of a sequence of time averaged semigroups 
on distribution flow state spaces. The asymptotic analysis of i-MCMC meth- 
ods is discussed in Section 5. The strong law of large numbers is presented 
in Section 5.2. We also provide an $L_r$-mean error bound for the occupation measures of the i-MCMC algorithms at each level $l$. In Section 5.3, we
discuss the long time behavior of these stochastic models in terms of the 
exponential stability properties of a time averaged type semigroup associ- 
ated with the sequence of target measures. We prove a uniform convergence 
theorem with respect to the level index $l$. The asymptotic analysis of the occupation measures associated with the complete self-interacting model on a fixed series of levels is discussed in Section 6. The $L_r$-mean error bounds and
the concentration analysis are presented, respectively, in Sections 6.1 and in 
6.2. The final section, Section 7, is concerned with contraction properties of 
time averaged Feynman-Kac distribution flows. 

1.4. Notation and conventions. For the convenience of the reader we 
have collected some of the main notation used in the article. We also recall 
some regularity properties of integral operators used further in the article.

We denote, respectively, by $\mathcal{M}(E)$, $\mathcal{M}_0(E)$, $\mathcal{P}(E)$ and $\mathcal{B}(E)$, the set of all finite signed measures on some measurable space $(E,\mathcal{E})$, the convex subset of measures with null mass, the set of all probability measures, and the Banach space of all bounded and measurable functions $f$ on $E$. We equip $\mathcal{B}(E)$ with the uniform norm $\|f\| = \sup_{x\in E}|f(x)|$. We also denote by $\mathcal{B}_1(E) \subset \mathcal{B}(E)$ the unit ball of functions $f \in \mathcal{B}(E)$ with $\|f\| \leq 1$, and by $\mathrm{Osc}_1(E)$ the convex set of $\mathcal{E}$-measurable functions $f$ with oscillations less than one; that is,

$$\mathrm{osc}(f) = \sup\{|f(x)-f(y)|;\ x,y \in E\} \leq 1.$$

We let $\mu(f) = \int \mu(dx) f(x)$ be the Lebesgue integral of a function $f \in \mathcal{B}(E)$ with respect to a measure $\mu \in \mathcal{M}(E)$. We slightly abuse the notation and sometimes denote by $\mu(A) = \mu(1_A)$ the measure of a measurable subset $A \in \mathcal{E}$.

Let $M(x,dy)$ be the kernel, from a measurable space $(E,\mathcal{E})$ into a measurable space $(F,\mathcal{F})$, of a bounded integral operator $f \mapsto M(f)$ from $\mathcal{B}(F)$ into $\mathcal{B}(E)$ such that the functions

$$M(f)(x) = \int_F M(x,dy) f(y) \in \mathbb{R}$$

are $\mathcal{E}$-measurable and bounded, for any $f \in \mathcal{B}(F)$. Such a kernel also generates a dual operator $\mu \mapsto \mu M$ from $\mathcal{M}(E)$ into $\mathcal{M}(F)$ defined by $(\mu M)(f) := \mu(M(f))$.

We denote by $\|M\| := \sup_{f\in\mathcal{B}_1(F)}\|M(f)\|$ the norm of the operator $f \mapsto M(f)$ and we equip the Banach space $\mathcal{M}(E)$ with the corresponding total variation norm $\|\mu\| = \sup_{f\in\mathcal{B}_1(E)}|\mu(f)|$. Using this slightly abusive notation, we have

$$\|M\| := \sup_{x\in E}\ \sup_{f\in\mathcal{B}_1(F)} |\delta_x M(f)| = \sup_{x\in E}\|\delta_x M\|,$$

where $\delta_x$ stands for the Dirac measure at the point $x \in E$. We recall that the norm of any kernel $M$ with null mass $M(1) = 0$ satisfies

$$\|M\| = \sup_{f\in\mathcal{B}_1(F)}\|M(f)\| = 2\sup_{f\in\mathrm{Osc}_1(F)}\|M(f)\|.$$

When $M$ has a constant mass, that is, $M(1)(x) = M(1)(y)$ for any $(x,y) \in E^2$, the operator $\mu \mapsto \mu M$ maps $\mathcal{M}_0(E)$ into $\mathcal{M}_0(F)$. In this situation, we let $\beta(M)$ be the Dobrushin coefficient of a kernel $M$ defined by the following formula:

$$\beta(M) := \sup\{\mathrm{osc}(M(f));\ f \in \mathrm{Osc}_1(F)\}.$$

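By construction, we have $M(f)/\beta(M) \in \mathrm{Osc}_1(E)$ as soon as $\beta(M) \neq 0$, so that for any $\mu \in \mathcal{M}_0(E)$

$$\|\mu M\| = 2\sup_{f\in\mathrm{Osc}_1(F)}|\mu M(f)| \leq \beta(M)\, 2\sup_{f\in\mathrm{Osc}_1(E)}|\mu(f)| \quad\Longrightarrow\quad \|\mu M\| \leq \beta(M)\|\mu\|.$$

Using the fact that $\|\delta_x - \delta_y\| = 2$ for $x \neq y$ and

$$\beta(M) = \sup_{f\in\mathrm{Osc}_1(F)}\ \sup_{(x,y)\in E^2}|(\delta_x M - \delta_y M)(f)| = \sup_{(x,y)\in E^2}\frac{\|\delta_x M - \delta_y M\|}{\|\delta_x - \delta_y\|}$$

we prove that

$$\beta(M) = \sup_{(x,y)\in E^2}\frac{\|\delta_x M - \delta_y M\|}{\|\delta_x - \delta_y\|} = \frac{1}{2}\sup_{(x,y)\in E^2}\|\delta_x M - \delta_y M\|$$

is also the norm of the kernel

$$\mu \in \mathcal{M}_0(E) \mapsto \mu M \in \mathcal{M}_0(F).$$

That is, we have

$$\beta(M) = \sup_{\mu\in\mathcal{M}_0(E)}\big(\|\mu M\|/\|\mu\|\big).$$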

More generally, for every kernel $K$ from a measurable space $(E',\mathcal{E}')$ into a measurable space $(E,\mathcal{E})$, with null mass $K(1) = 0$, we have

$$\|KM\| = \sup_{x\in E'}\|(\delta_x K)M\| \leq \beta(M)\sup_{x\in E'}\|\delta_x K\| \quad\Longrightarrow\quad \|KM\| \leq \beta(M)\|K\|.$$

Unless otherwise stated, we use the letter $C$ to denote a universal constant whose value may vary from line to line. Finally, we shall use the conventions $\sum_{\emptyset} = 0$ and $\prod_{\emptyset} = 1$.
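
As a sanity check of the contraction inequality $\|\mu M\| \leq \beta(M)\|\mu\|$ for null-mass measures, the following minimal sketch evaluates both sides on a three-point state space. The matrix `M`, the measure `mu` and the helper names are illustrative assumptions and do not come from the article.

```python
def tv_norm(mu):
    """Total variation norm sup_{|f|<=1} |mu(f)|, i.e. sum_x |mu(x)| on a finite space."""
    return sum(abs(m) for m in mu)


def apply_dual(mu, M):
    """Dual action mu -> mu M of a kernel M (list of rows) on a signed measure mu."""
    return [sum(mu[x] * M[x][y] for x in range(len(M))) for y in range(len(M[0]))]


def dobrushin(M):
    """Dobrushin coefficient beta(M) = (1/2) sup_{x,y} ||delta_x M - delta_y M||."""
    return 0.5 * max(
        tv_norm([a - b for a, b in zip(row_x, row_y)])
        for row_x in M
        for row_y in M
    )


# Hypothetical Markov transition on {0, 1, 2}.
M = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.1, 0.3, 0.6]]
mu = [0.7, -0.2, -0.5]  # signed measure with null mass

beta = dobrushin(M)
lhs = tv_norm(apply_dual(mu, M))   # ||mu M||
rhs = beta * tv_norm(mu)           # beta(M) ||mu||
print(beta, lhs, rhs)
```

On a finite space the supremum over pairs of points is a maximum over pairs of rows, which is why the coefficient reduces to half the largest $\ell_1$ distance between two rows.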

1.5. Interacting Markov chain Monte Carlo methods. We describe here the i-MCMC methodology to numerically solve (1.1). We consider a Markov transition $M^{(0)}$ from $S^{(0)}$ into itself and a collection of Markov transitions $M^{(l)}_\mu$ from $S^{(l)}$ into itself, indexed by the parameter $l \geq 0$ and the set of probability measures $\mu \in \mathcal{P}(S^{(l-1)})$. We further assume that the invariant measure of each operator $M^{(l)}_\mu$ is given by $\Phi_l(\mu)$; that is, we have

$$\forall l \geq 0,\ \forall \mu \in \mathcal{P}(S^{(l-1)}) \qquad \Phi_l(\mu)\, M^{(l)}_\mu = \Phi_l(\mu).$$

For $l = 0$, we use the convention $\Phi_0(\eta^{(-1)}) = \pi^{(0)}$ and $M^{(0)}_{\eta^{(-1)}} = M^{(0)}$. For every $l \leq m$, we denote by $\eta^{(l)} \in \mathcal{P}(S^{(l)})$ the image measure of a measure $\eta \in \mathcal{P}(\prod_{0\leq k\leq m} S^{(k)})$ on the $l$th level space $S^{(l)}$. We also fix a sequence of probability measures $\nu_k$ on $S^{(k)}$, with $k \geq 0$.

We let $X^{(0)} := (X^{(0)}_n)_{n\geq 0}$ be a Markov chain on $S^{(0)}$ with initial distribution $\nu_0$ and Markov transitions $M^{(0)}$. For every $k \geq 1$, given a realization of the chain $X^{(k-1)} := (X^{(k-1)}_n)_{n\geq 0}$, the $k$th level chain $X^{(k)}_n$ is a Markov chain with initial distribution $\nu_k$ and with random Markov transitions $M^{(k)}_{\eta^{(k-1)}_n}$ depending on the current occupation measures $\eta^{(k-1)}_n$ of the chain at level $(k-1)$; that is, we have

(1.3) $$P\big(X^{(k)}_{n+1} \in dx \mid X^{(k-1)}, X^{(k)}_n\big) = M^{(k)}_{\eta^{(k-1)}_n}(X^{(k)}_n, dx)$$

with

$$\eta^{(k-1)}_n := \frac{1}{n+1}\sum_{p=0}^n \delta_{X^{(k-1)}_p}.$$

The rationale behind this is that the $k$th level chain $X^{(k)}_n$ behaves asymptotically as a Markov chain with time homogeneous transitions $M^{(k)}_{\pi^{(k-1)}}$ as long as $\eta^{(k-1)}_n$ is a good approximation of $\pi^{(k-1)}$.

In the special case where $M^{(k)}_\mu(x^k,\cdot) = \Phi_k(\mu)$, the $k$th level chain $(X^{(k)}_n)_{n\geq 1}$ is a collection of conditionally independent random variables with distributions $(\Phi_k(\eta^{(k-1)}_{n-1}))_{n\geq 1}$; that is, we have

(1.4) $$P\big((X^{(k)}_1,\ldots,X^{(k)}_n) \in dx \mid X^{(k-1)}\big) = \prod_{p=1}^n \Phi_k\Big(\frac{1}{p}\sum_{0\leq q<p}\delta_{X^{(k-1)}_q}\Big)(dx_p),$$

where $dx = d(x_1,\ldots,x_n) = dx_1\times\cdots\times dx_n$ stands for an infinitesimal neighborhood of a generic path sequence $(x_1,\ldots,x_n) \in (S^{(k)})^n$.

We end this section with a SIMC interpretation of the stochastic algorithm discussed above. We consider the product space

$$E_m := S^{(0)}\times\cdots\times S^{(m)}$$

and we let $(K^{(m)}_\eta)_{\eta\in\mathcal{P}(E_m)}$ be the collection of Markov transitions from $E_m$ into itself given by

(1.5) $$\forall x := (x^0,\ldots,x^m) \in E_m \qquad K^{(m)}_\eta(x,dy) = \prod_{0\leq l\leq m} M^{(l)}_{\eta^{(l-1)}}(x^l, dy^l),$$

where $dy := dy^0\times\cdots\times dy^m$ stands for an infinitesimal neighborhood of a generic point $y := (y^0,\ldots,y^m) \in E_m$, and $\eta^{(l)} \in \mathcal{P}(S^{(l)})$ stands for the image measure of a measure $\eta \in \mathcal{P}(E_m)$ on the $l$th level space $S^{(l)}$, with $m \geq l$. In other words, $\eta^{(l)}$ is the $l$th marginal of the measure $\eta$. In this notation, we can readily check that

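$$X^{[m]}_n := (X^{(0)}_n,\ldots,X^{(m)}_n)$$

is an $E_m$-valued SIMC with elementary transitions defined by

(1.6) $$P\big(X^{[m]}_{n+1} \in dy \mid \mathcal{F}^{[m]}_n\big) = K^{(m)}_{\eta^{[m]}_n}(X^{[m]}_n, dy) \quad\text{with}\quad \eta^{[m]}_n = \frac{1}{n+1}\sum_{p=0}^n \delta_{X^{[m]}_p},$$

where $\mathcal{F}^{[m]}_n$ stands for the filtration generated by $X^{[m]}$.

1.6. Statement of some results. We further assume that the mappings $\Phi_l : \mathcal{P}(S^{(l-1)}) \rightarrow \mathcal{P}(S^{(l)})$ satisfy the following regularity condition for any $l \geq 1$ and any pair of measures $(\mu,\nu) \in \mathcal{P}(S^{(l-1)})^2$:

(1.7) $$\forall f \in \mathcal{B}(S^{(l)}) \qquad |[\Phi_l(\mu) - \Phi_l(\nu)](f)| \leq \int |[\mu-\nu](g)|\ \Gamma_l(f,dg)$$

for some kernel $\Gamma_l$ from $\mathcal{B}(S^{(l)})$ into $\mathcal{B}(S^{(l-1)})$, with

$$\int \Gamma_l(f,dg)\,\|g\| \leq \Lambda_l\|f\| \quad\text{and}\quad \Lambda_l < \infty.$$

We also suppose that there exist some integer $n_l \geq 1$ and some constant $c_l$ such that we have

(1.8) $$\|M^{(l)}_\mu - M^{(l)}_\nu\| \leq c_l\|\mu-\nu\| \quad\text{and}\quad b_l(n_l) := \sup_{\mu\in\mathcal{P}(S^{(l-1)})}\beta\big((M^{(l)}_\mu)^{n_l}\big) < 1.$$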

This pair of abstract regularity conditions is rather standard. The first one (1.7) is a natural Lipschitz property on the weakly continuous integral mappings

$$\forall f \in \mathcal{B}(S^{(l)}) \qquad \mu \in \mathcal{P}(S^{(l-1)}) \mapsto \Phi_l(\mu)(f) \in \mathbb{R}.$$

Roughly speaking, this weak Lipschitz property simply expresses the fact that $\Phi_l(\mu)(f)$ only depends on integrals of functions with respect to the reference measure $\mu$. This condition is clearly satisfied for linear Markov semigroups $\Phi_l(\mu) = \mu K_l$ associated with some Markov transition $K_l$. We shall discuss this condition in the context of nonlinear Feynman-Kac type semigroups (2.1) in Section 2.1.

In the special case where $M^{(l)}_\mu(x^l,\cdot) = \Phi_l(\mu)$, the second condition (1.8) is trivially met for $n_l = 1$ with $b_l(n_l) = 0$. In this particular situation, the first Lipschitz property of the mapping $\Phi_l(\mu)$ takes the following form:

$$\|\Phi_l(\mu) - \Phi_l(\nu)\| \leq c_l\|\mu-\nu\|.$$

For more general models, condition (1.8) expresses the fact that the Markov transitions $M^{(l)}_\mu$ are strongly continuous and they satisfy Dobrushin's mixing condition, uniformly with respect to $\mu$. We shall discuss this regularity condition in the context of Metropolis-Hastings type algorithms (2.7) in Section 2.2.

Under the conditions (1.8), for every $\eta \in \mathcal{P}(E_m)$, the invariant measure $\omega^{(m)}_K(\eta) \in \mathcal{P}(E_m)$ of $K^{(m)}_\eta$ defined in (1.5) is given by the tensor product measure

(1.9) $$\omega^{(m)}_K(\eta) = \pi^{(0)} \otimes \Phi_1(\eta^{(0)}) \otimes\cdots\otimes \Phi_m(\eta^{(m-1)}).$$

We observe that the tensor product measure

(1.10) $$\pi^{[m]} := \pi^{(0)}\otimes\cdots\otimes\pi^{(m)}$$

is a fixed point of the mapping $\omega^{(m)}_K : \eta \in \mathcal{P}(E_m) \mapsto \omega^{(m)}_K(\eta) \in \mathcal{P}(E_m)$. Using this notation, our main results are basically as follows.

Theorem 1.1. For any $r \geq 1$, $m \geq 1$, and any function $f \in \mathcal{B}(E_m)$ we have

$$\sup_{n\geq 1}\sqrt{n}\ E\big(|\eta^{[m]}_n(f) - \pi^{[m]}(f)|^r\big) < \infty.$$

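Under some additional regularity conditions, we have the exponential inequality

$$\forall t > 0 \qquad \limsup_{n\to\infty}\frac{1}{n}\log P\big(|[\eta^{[m]}_n - \pi^{[m]}](f)| > t\big) < -\frac{t^2}{2\sigma_m^2}$$

for some finite constant $\sigma_m < \infty$ as well as the following uniform convergence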
gence estimate: 

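$$\sup_{k\geq 0}\ \sup_{n\geq 1}\ n^{\alpha/2}\, E\big(|\eta^{(k)}_n(f_k) - \pi^{(k)}(f_k)|^r\big) < \infty$$

for some parameter $\alpha \in (0,1]$ and for any collection of functions $(f_k)_{k\geq 0} \in \prod_{k\geq 0}\mathcal{B}_1(S^{(k)})$.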

We end this introduction with a series of comments and open research 
questions. 

First, the mean error bounds and the exponential estimates presented 
above suggest the existence of Gaussian fluctuations of the occupation measures $\eta^{[m]}_n$ around their limiting value $\pi^{[m]}$ with a fluctuation rate $\sqrt{n}$. We
have recently studied these fluctuations in [2, 3]. 

It might be surprising that the decays to equilibrium presented in The- 
orem 1.1 differ from the three types of decays exhibited in [6, 7]. To un- 
derstand the main differences between these classes of interacting processes, 
we recall that the decay rate to equilibrium often depends on the contrac- 
tion coefficient of the invariant measure mapping associated with a given 
self-interacting model. In our context, these mappings are not necessarily
contractive. Nevertheless, we shall see in Section 6 that the semigroup asso- 
ciated with these mappings becomes essentially constant after a sufficiently 
large number of iterations. In this respect, the self-interacting models dis- 
cussed in the present article are more regular than the ones analyzed in 
[6, 7]. 

The uniform convergence estimate with respect to the number of levels depends on the stability properties of a time averaged semigroup associated with the mappings $\Phi_l$. The contraction properties of this new class of nonlinear semigroups are studied in Section 7 in the context of Feynman-Kac
models. We show that the stability properties of the reference Feynman-Kac 
semigroups can be transferred to study the associated time averaged models. 
In more general situations this question remains open. 

2. Motivating applications. 

2.1. Feynman-Kac models. The main examples of mappings $\Phi_l$ considered here are the Feynman-Kac transformations given below:

(2.1) $$\forall l \geq 0,\ \forall (\mu,f) \in (\mathcal{P}(S^{(l)})\times\mathcal{B}(S^{(l+1)})) \qquad \Phi_{l+1}(\mu)(f) := \mu(G_l L_{l+1}(f))/\mu(G_l),$$

where $G_l$ is a positive potential function on $S^{(l)}$, and $L_{l+1}$ stands for a Markov transition from $S^{(l)}$ into $S^{(l+1)}$. In this situation, the solution of the measure-valued equation (1.1) is given by the normalized Feynman-Kac distribution flow described below:

$$\pi^{(l)}(f) = \gamma^{(l)}(f)/\gamma^{(l)}(1) \quad\text{with}\quad \gamma^{(l)}(f) := E\Big(f(Y_l)\prod_{0\leq k<l} G_k(Y_k)\Big),$$

where $(Y_l)_{l\geq 0}$ stands for a Markov chain taking values in the state spaces $(S^{(l)})_{l\geq 0}$, with initial distribution $\pi^{(0)}$ and Markov transitions $(L_l)_{l\geq 1}$. These probabilistic models arise in a very wide variety of applications including nonlinear filtering and rare event analysis, as well as the spectral analysis of Schroedinger type operators and directed polymer analysis [5]. We also underline that the unnormalized measures $\gamma^{(l)}$ are expressed in terms of integrals on path spaces and we recall that $\gamma^{(l)}$ can be expressed in terms of the sequence of measures $(\pi^{(k)})_{0\leq k<l}$ with the following formulae:

(2.2) $$\gamma^{(l)}(f) = \pi^{(l)}(f)\prod_{0\leq k<l}\pi^{(k)}(G_k).$$

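To check this assertion, we simply observe that

$$\gamma^{(l)}(f) = \pi^{(l)}(f)\times\gamma^{(l)}(1)$$

and we have the key multiplicative formula

(2.3) $$\gamma^{(l)}(1) = \gamma^{(l-1)}(G_{l-1}) = \pi^{(l-1)}(G_{l-1})\times\gamma^{(l-1)}(1) \quad\Longrightarrow\quad \gamma^{(l)}(1) = \prod_{0\leq k<l}\pi^{(k)}(G_k).$$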

Thus the i-MCMC methodology allows us to estimate the normalizing constants $\gamma^{(l)}(1)$ by replacing the measures $\pi^{(k)}$ by their approximations $\eta^{(k)}_n$ in (2.3). These models are quite flexible. For instance, the reference Markov chain may represent the paths from the origin up to the current time $l$ of an auxiliary chain $Y'_l$ taking values in some state spaces $E'_l$ with some Markov transitions $(L'_l)_{l\geq 1}$ and potentials $(G'_l)_{l\geq 1}$; that is, we have

(2.4) $$Y_l := (Y'_0,\ldots,Y'_l) \in S^{(l)} := (E'_0\times\cdots\times E'_l)$$

and

(2.5) $$L_l(y_{l-1}, d\bar y_l) = \delta_{(y'_0,\ldots,y'_{l-1})}\big(d(\bar y'_0,\ldots,\bar y'_{l-1})\big)\, L'_l(y'_{l-1}, d\bar y'_l), \qquad G_l(y_l) = G'_l(y'_l),$$

with $y_{l-1} = (y'_0,\ldots,y'_{l-1})$ and $\bar y_l = (\bar y'_0,\ldots,\bar y'_l)$.


2.2. Interacting Markov chain Monte Carlo methods for Feynman-Kac models. In the Feynman-Kac context and assuming we are working on path spaces (2.4), we can propose the following two i-MCMC algorithms to approximate $\pi^{(k)}$. The first one simply consists of sampling directly $X^{(k)}_p = (X^{(k)}_{p,0}, X^{(k)}_{p,1},\ldots,X^{(k)}_{p,k})$ from the right-hand side product of the formula (1.4), which takes here the following form:

$$\Phi_k\Big(\frac{1}{p}\sum_{0\leq q<p}\delta_{X^{(k-1)}_q}\Big)(dx^{(k)}_p) = \sum_{0\leq q<p}\frac{G_{k-1}(X^{(k-1)}_q)}{\sum_{0\leq m<p} G_{k-1}(X^{(k-1)}_m)}\ L_k(X^{(k-1)}_q, dx^{(k)}_p),$$

where $dx^{(k)}_p = dx^{(k)}_{p,0}\times\cdots\times dx^{(k)}_{p,k}$. We see that $X^{(k)}_p$ is sampled according to two separate genetic type mechanisms. First, we randomly select one state $X^{(k-1)}_q$ at level $(k-1)$ with a probability proportional to its potential value $G_{k-1}(X^{(k-1)}_q)$. Second, we randomly evolve from this state according to the mutation transition $L_k$. This i-MCMC model can be interpreted as a spatial branching and interacting process. In this interpretation, the $k$th chain tends to duplicate individuals with large potential values, at the expense of individuals with low potential values. The selected offspring randomly evolve from the state space $S^{(k-1)}$ to the state space $S^{(k)}$ at the next level.
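
The selection and mutation mechanism just described can be sketched in a few lines. This is a toy illustration only, under assumptions that are not in the article: two levels, the same three-point state space at both levels, and a level-0 chain made of exact independent draws from $\pi^{(0)}$. Because the state space is finite, choosing a past state $X^{(0)}_q$ with probability proportional to its potential is equivalent to choosing a state value with probability proportional to (visit count) times (potential), which keeps each iteration cheap.

```python
import random


def draw(weights, rng):
    """Sample an index with probability proportional to nonnegative weights."""
    u = rng.random() * sum(weights)
    acc = 0.0
    for index, w in enumerate(weights):
        acc += w
        if u < acc:
            return index
    return len(weights) - 1  # guard against floating-point round-off


def imcmc_level_one(pi0, G, L, n_iter, seed=0):
    """Occupation measure of the level-1 chain built from the level-0 history."""
    rng = random.Random(seed)
    d = len(pi0)
    counts0 = [0] * d  # visit counts of X_0^{(0)}, ..., X_{p-1}^{(0)}
    counts1 = [0] * d  # visit counts of X_1^{(1)}, ..., X_p^{(1)}
    for p in range(n_iter):
        if p >= 1:
            # Selection: pick a past level-0 state proportionally to its potential.
            parent = draw([c * g for c, g in zip(counts0, G)], rng)
            # Mutation: move with the transition L from the selected state.
            counts1[draw(L[parent], rng)] += 1
        counts0[draw(pi0, rng)] += 1  # next level-0 state (assumed exact sampling)
    return [c / (n_iter - 1) for c in counts1]


def exact_level_one(pi0, G, L):
    """Phi_1(pi^{(0)}) computed exactly from (2.1)."""
    weighted = [p * g for p, g in zip(pi0, G)]
    mass = sum(weighted)
    return [sum(weighted[x] * L[x][y] for x in range(len(pi0))) / mass
            for y in range(len(L[0]))]


# Hypothetical model on {0, 1, 2}.
pi0 = [0.2, 0.5, 0.3]
G = [1.0, 0.5, 2.0]
L = [[0.6, 0.3, 0.1],
     [0.2, 0.5, 0.3],
     [0.1, 0.4, 0.5]]

estimate = imcmc_level_one(pi0, G, L, n_iter=20000)
target = exact_level_one(pi0, G, L)
print(estimate, target)
```

Unlike an interacting particle run with a fixed population size, the loop can simply be continued to refine the estimate, which is the practical point made in Section 1.3.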

For the Feynman-Kac transformations (2.1), we proved in [5] that the condition (1.8) ensuring convergence of the algorithm is satisfied with $c_l = \beta(L_l)/\varepsilon_{l-1}(G)$ as soon as the potential functions satisfy the following condition:

(G) For any $l \geq 0$, the potential functions $G_l$ are bounded above and bounded away from zero, so that

$$\varepsilon_l(G) := \inf_{x,y}\frac{G_l(x)}{G_l(y)} \in (0,1).$$

We can also propose the following alternative i-MCMC algorithm to approximate $\pi^{(l)}$, which relies on using a transition kernel $M^{(l)}_\mu$ different from $\Phi_l(\mu)$. We introduce the following kernel from $S^{(l-1)}$ into $E'_l$:

(2.6) $$R_l((x'_0,\ldots,x'_{l-1}), dx'_l) = L'_l(x'_{l-1}, dx'_l)\, G'_{l-1}(x'_{l-1}).$$

In this scenario, it is sensible to use for $M^{(l)}_\mu$ in the i-MCMC algorithm the following Markov kernel on the product space $S^{(l)}$, indexed by the set of measures $\mu \in \mathcal{P}(S^{(l-1)})$:

(2.7) $$M^{(l)}_\mu(x,dy) = (\mu\otimes K_l)(dy)\,(1\wedge r_l(x,y)) + \Big(1 - \int (1\wedge r_l(x,z))\,(\mu\otimes K_l)(dz)\Big)\delta_x(dy),$$

where $K_l$ is a Markov transition from $S^{(l-1)}$ into $E'_l$ and for every $(u,v)$ and $(w,z) \in (S^{(l-1)}\times E'_l)$

(2.8) $$r_l((u,v),(w,z)) := \frac{d(K_l(u,\cdot)\otimes R_l(w,\cdot))}{d(R_l(u,\cdot)\otimes K_l(w,\cdot))}(v,z),$$

where we assume that

$$K_l(u,\cdot)\otimes R_l(w,\cdot) \ll R_l(u,\cdot)\otimes K_l(w,\cdot).$$

It can be checked that the kernel $M^{(l)}_\mu$ is nothing but a Metropolis-Hastings kernel of proposal distribution $\mu\otimes K_l$ and invariant distribution $\Phi_l(\mu)$.

We can also easily establish that for any measures $(\mu,\nu) \in \mathcal{P}(S^{(l-1)})^2$

$$\|M^{(l)}_\mu - M^{(l)}_\nu\| \leq 2\|\mu-\nu\|$$

so that the first condition on the left-hand side of (1.8) is satisfied. Under 
the additional assumption that for any $(u,v) \in (S^{(l-1)}\times E'_l)$

$$\frac{dR_l(u,\cdot)}{dK_l(u,\cdot)}(v) \leq C_l,$$

it follows from [11], Theorem 2.1, that

$$\beta(M^{(l)}_\mu) \leq (1 - C_l^{-1})$$

from which we conclude that the second condition on the right-hand side of (1.8) is met with $n_l = 1$ and $b_l(n_l) = (1 - C_l^{-1})$.
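
To make the Metropolis-Hastings construction (2.7) more concrete, here is a minimal finite-state analogue, written under assumptions that are not in the article: the target and the proposal are plain probability vectors on three points, and the proposal does not depend on the current state. The sketch builds the transition matrix, checks that the target is invariant, and compares the Dobrushin coefficient with the bound $1 - C^{-1}$, where `C` is the largest target-to-proposal ratio.

```python
def mh_independence_kernel(target, proposal):
    """Metropolis-Hastings matrix for a state-independent proposal on a finite space."""
    d = len(target)
    ratio = [t / q for t, q in zip(target, proposal)]  # target / proposal
    M = [[0.0] * d for _ in range(d)]
    for x in range(d):
        for y in range(d):
            if y != x:
                # Propose y, accept with probability 1 ^ (ratio[y] / ratio[x]).
                M[x][y] = proposal[y] * min(1.0, ratio[y] / ratio[x])
        M[x][x] = 1.0 - sum(M[x])  # rejected proposals stay at x
    return M


def dobrushin(M):
    """beta(M) = (1/2) max over pairs of rows of their l1 distance."""
    return 0.5 * max(
        sum(abs(a - b) for a, b in zip(row_x, row_y))
        for row_x in M
        for row_y in M
    )


# Hypothetical target and proposal on {0, 1, 2}.
target = [0.5, 0.3, 0.2]
proposal = [0.2, 0.3, 0.5]

M = mh_independence_kernel(target, proposal)
invariant = [sum(target[x] * M[x][y] for x in range(3)) for y in range(3)]
C = max(t / q for t, q in zip(target, proposal))
beta = dobrushin(M)
print(invariant, beta, 1.0 - 1.0 / C)
```

The bound follows from the minorization $M(x,\cdot) \geq C^{-1}\,\mathrm{target}(\cdot)$, which holds for every $x$ when the ratio is bounded by `C`.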



2.3. Interacting particle and Markov chain Monte Carlo methods. As 
mentioned in the Introduction, in contrast to interacting particle methods 
presented in Section 1.2, we emphasize that the precision parameter n of 
i-MCMC models is not fixed but increases at every time step. There exist 
several ways to combine an interacting particle method with an i-MCMC 
method. 

For instance, suppose we are given a realization of an interacting particle algorithm $X^{(l)} = (X^{(l)}_p)_{1\leq p\leq N}$ with a precision parameter $N$. One natural way to initialize the i-MCMC model is to start with a collection of initial random states $X^{(l)}_0$ sampled according to the $N$-particle approximation measures

$$\pi^{(l)}_N := \frac{1}{N}\sum_{i=1}^N \delta_{X^{(l)}_i}.$$

Another strategy is to use the $N$-particle approximation measures $\pi^{(l)}_N$ in the evolution of the i-MCMC model. In other words, we interpret the series of samples $X^{(l)}_i$, $1 \leq i \leq N$, as the first $N$ iterations of the i-MCMC model at level $l$. More formally, this strategy simply substitutes the current occupation measure $\eta^{(k-1)}_n$ of the chain at level $(k-1)$ in (1.3) by the occupation measure $\eta^{(N,k-1)}_n$ of the whole sequence of random variables at level $(k-1)$ defined by

$$\eta^{(N,k-1)}_n := \frac{n+1}{N+n+1}\,\eta^{(k-1)}_n + \frac{N}{N+n+1}\,\pi^{(k-1)}_N.$$

The convergence analysis of these two natural combinations of an inter- 
acting particle method and i-MCMC method can be conducted easily using 
the techniques developed in this article. 



3. Time inhomogeneous Markov chains. 



3.1. Description of the models. We consider a collection of Markov transitions $K_\eta$ on some measurable space $(E,\mathcal{E})$ indexed by the set of probability measures $\eta \in \mathcal{P}(F)$ on some possibly different measurable space $(F,\mathcal{F})$. We further assume that for any pair of measures $(\eta,\mu) \in \mathcal{P}(F)^2$ and some integer $n_0 \geq 1$ we have

(3.1) $$\|K_\eta - K_\mu\| \leq c\,\|\eta-\mu\| \quad\text{and}\quad b(n_0) := \sup_{\eta\in\mathcal{P}(F)}\beta(K^{n_0}_\eta) < 1.$$

We associate with the collection of transitions $K_\eta$ an $E$-valued inhomogeneous random process $X_n$ with elementary transitions defined by

$$P(X_{n+1} \in dx \mid X_0,\ldots,X_n) = K_{\mu_n}(X_n, dx),$$

where $\mu_n$ is a sequence of possibly random distributions on $F$ that only depends on the random sequence $(X_0,\ldots,X_n)$. More precisely, $\mu_n$ is a measurable random variable with respect to the $\sigma$-field generated by the random states $X_p$ from the origin $p = 0$ up to the current time horizon $p = n$. We further assume that the variations of the flow $\mu_n$ are controlled by some sequence of random variables $\varepsilon(n)$ in the sense that

$$\forall n \geq 0 \qquad \|\mu_{n+1} - \mu_n\| \leq \varepsilon(n).$$

We let $\bar\varepsilon(n)$ be the mean variation of the distribution flow $(\mu_p)_{0\leq p\leq n}$; that is, we have

$$\bar\varepsilon(n) := \frac{1}{n+1}\sum_{p=0}^n \varepsilon(p).$$

For SIMC, we have $F = E$ and the measure $\mu_n$ coincides with the occupation measures of the chain up to the current time $n$. In this particular situation, we have

(3.2) $$\mu_n = \eta_n := \frac{1}{n+1}\sum_{p=0}^n \delta_{X_p} \quad\Longrightarrow\quad \varepsilon(n) \leq \frac{2}{n+2}.$$

This implies that

$$\bar\varepsilon(n) \leq \frac{2}{n+1}\log(n+2).$$
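
The two bounds above come from the recursive form $\eta_{n+1} = \eta_n + \frac{1}{n+2}(\delta_{X_{n+1}} - \eta_n)$ of the occupation measure. The following sketch checks them on a finite state space; the uniformly random sequence of states is only a placeholder assumption standing in for an actual chain, since the bounds hold for any sequence.

```python
import math
import random


def occupation_update(eta, n, x):
    """Return eta_{n+1} from eta_n after observing the new state x = X_{n+1}."""
    step = 1.0 / (n + 2)
    return [(1.0 - step) * w + (step if s == x else 0.0) for s, w in enumerate(eta)]


def run(num_steps, d=4, seed=0):
    """Track eta_n and the variations ||eta_{n+1} - eta_n|| along a random path."""
    rng = random.Random(seed)
    x0 = rng.randrange(d)
    eta = [1.0 if s == x0 else 0.0 for s in range(d)]  # eta_0 = delta_{X_0}
    variations = []
    for n in range(num_steps):
        x = rng.randrange(d)  # placeholder dynamics, not an i-MCMC transition
        new_eta = occupation_update(eta, n, x)
        variations.append(sum(abs(a - b) for a, b in zip(new_eta, eta)))
        eta = new_eta
    return eta, variations


eta, variations = run(1000)
n = len(variations) - 1
mean_variation = sum(variations) / (n + 1)
print(mean_variation, 2.0 * math.log(n + 2) / (n + 1))
```

The recursive update also means that an implementation never has to store the whole history of a level to evaluate its occupation measure on a finite space.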



Under assumption (3.1), every elementary transition $K_{\mu_n}(x,dy)$ admits an invariant measure $\omega(\mu_n) = \omega(\mu_n)K_{\mu_n}$.

For sufficiently small variations $\varepsilon(n)$ of the distribution flow $\mu_n$, we expect that the occupation measures $\eta_n$ have the same asymptotic behavior as the mean values $\bar\omega_n(\mu)$ of the instantaneous invariant measures $\omega(\mu_p)$ from time $p = 0$ up to the current time $p = n$. That is, for large values of the time horizon $n$, we have in some sense

(3.3) $$\eta_n \simeq \bar\omega_n(\mu) := \frac{1}{n+1}\sum_{p=0}^n \omega(\mu_p).$$

3.2. A resolvent analysis. We recall that assumption (3.1) ensures that $K_\eta$ has a unique invariant measure $\omega(\eta) = \omega(\eta)K_\eta \in \mathcal{P}(E)$ for any $\eta \in \mathcal{P}(F)$, and the pair of sums given by

(3.4) $$\alpha(\eta) := \sum_{n\geq 0}\beta(K^n_\eta) \in [1,\infty) \quad\text{and}\quad \sum_{n\geq 0}[K^n_\eta - \omega(\eta)](f)$$

are absolutely convergent for any $f \in \mathcal{B}(E)$. The main simplification of these conditions comes from the fact that the resolvent operator

$$P_\eta : f \in \mathcal{B}(E) \mapsto P_\eta(f) := \sum_{n\geq 0}[K^n_\eta - \omega(\eta)](f) \in \mathcal{B}(E)$$

is a well-defined solution of the Poisson equation

$$(K_\eta - \mathrm{Id})P_\eta = (\omega(\eta) - \mathrm{Id}), \qquad \omega(\eta)P_\eta = 0.$$

The reader should not be misled by the notation $P_\eta$. In this context, $P_\eta$ is not a Markov transition kernel. We have used the letter $P$ in reference to the solution of the Poisson equation.
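
On a finite state space the resolvent $P_\eta$ is a matrix, and the Poisson equation can be verified directly. The sketch below uses a hypothetical three-state transition matrix `K` (an assumption, not a kernel from the article), approximates the invariant measure by power iteration, and truncates the series $\sum_{n\geq 0}[K^n - \omega]$ after a number of terms chosen large enough for the geometric tail to be negligible.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def invariant_measure(K, iters=500):
    """Approximate omega = omega K by iterating the dual action from the uniform law."""
    d = len(K)
    omega = [1.0 / d] * d
    for _ in range(iters):
        omega = [sum(omega[x] * K[x][y] for x in range(d)) for y in range(d)]
    return omega


def poisson_resolvent(K, terms=500):
    """Truncated series P = sum_{n < terms} (K^n - omega), with omega as constant rows."""
    d = len(K)
    omega = invariant_measure(K)
    power = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # K^0
    P = [[0.0] * d for _ in range(d)]
    for _ in range(terms):
        for i in range(d):
            for j in range(d):
                P[i][j] += power[i][j] - omega[j]
        power = matmul(power, K)
    return P, omega


# Hypothetical transition matrix with Dobrushin coefficient 0.4.
K = [[0.5, 0.3, 0.2],
     [0.2, 0.6, 0.2],
     [0.1, 0.3, 0.6]]
d = len(K)
P, omega = poisson_resolvent(K)
KP = matmul(K, P)

# Residual of (K - Id) P = omega - Id, entry by entry.
residual = max(
    abs((KP[i][j] - P[i][j]) - (omega[j] - (1.0 if i == j else 0.0)))
    for i in range(d) for j in range(d)
)
omega_P = [sum(omega[x] * P[x][y] for x in range(d)) for y in range(d)]
operator_norm = max(sum(abs(v) for v in row) for row in P)
print(residual, omega_P, operator_norm)
```

The entries of `P` are signed and its rows sum to zero, which illustrates the remark that $P_\eta$ is not a Markov transition kernel.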

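Proposition 3.1. For any $\eta \in \mathcal{P}(F)$, $P_\eta$ is a bounded integral operator on $\mathcal{B}(E)$ and we have

$$(\|P_\eta\|/2) \vee \beta(P_\eta) \leq \alpha(\eta) \leq \frac{n_0}{1 - \beta(K^{n_0}_\eta)}.$$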

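Proof. The fact that $\beta(P_\eta) \leq \alpha(\eta)$ is readily deduced from the following decomposition:

$$P_\eta(f)(x) - P_\eta(f)(y) := \sum_{n\geq 0}[K^n_\eta(f)(x) - K^n_\eta(f)(y)].$$

Indeed, using this decomposition we find that $\mathrm{osc}(P_\eta(f)) \leq \sum_{n\geq 0}\mathrm{osc}(K^n_\eta(f))$. Recalling that $\mathrm{osc}(K^n_\eta(f)) \leq \beta(K^n_\eta)\,\mathrm{osc}(f)$, we conclude that

$$\mathrm{osc}(P_\eta(f)) \leq \Big[\sum_{n\geq 0}\beta(K^n_\eta)\Big]\mathrm{osc}(f) \quad\Longrightarrow\quad \beta(P_\eta) \leq \sum_{n\geq 0}\beta(K^n_\eta).$$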



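In much the same way, we use the fact that

$$P_\eta(f)(x) = \sum_{n\geq 0}\int [K^n_\eta(f)(x) - K^n_\eta(f)(y)]\ \omega(\eta)(dy)$$

to check that

$$\|P_\eta(f)\| \leq \sum_{n\geq 0}\mathrm{osc}(K^n_\eta(f))$$

and

$$\|P_\eta(f)\| \leq \Big[\sum_{n\geq 0}\beta(K^n_\eta)\Big]\mathrm{osc}(f) \quad\Longrightarrow\quad \|P_\eta\| \leq 2\sum_{n\geq 0}\beta(K^n_\eta).$$

To prove that $\alpha(\eta) \leq \frac{n_0}{1-\beta(K^{n_0}_\eta)}$, we use the decomposition

$$\alpha(\eta) = \sum_{p\geq 1}\ \sum_{n=(p-1)n_0}^{pn_0-1}\beta(K^n_\eta) = \sum_{p\geq 1}\ \sum_{r=0}^{n_0-1}\beta(K^{(p-1)n_0+r}_\eta).$$

Since we have

$$\beta(K^{(p-1)n_0+r}_\eta) \leq \beta(K^{(p-1)n_0}_\eta)\,\beta(K^r_\eta) \leq \beta(K^{n_0}_\eta)^{p-1}\beta(K^r_\eta) \leq \beta(K^{n_0}_\eta)^{p-1}$$

we conclude that $\alpha(\eta) \leq n_0\sum_{p\geq 0}\beta(K^{n_0}_\eta)^p = \frac{n_0}{1-\beta(K^{n_0}_\eta)}$. The proof of the proposition is now complete. □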

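Proposition 3.2. For any pair of measures $(\eta,\mu) \in \mathcal{P}(F)^2$, we have

(3.5) $$\|\omega(\eta) - \omega(\mu)\| \leq \delta_{n_0}(\eta,\mu)\,\|\eta-\mu\|$$

and

$$\|P_\mu - P_\eta\| \leq \alpha(\eta)\,[2c\,\alpha(\mu) + \delta_{n_0}(\eta,\mu)]\ \|\eta-\mu\|$$

for some finite constant $\delta_{n_0}(\eta,\mu)$ such that

(3.6) $$\delta_{n_0}(\eta,\mu) \leq \frac{c\,n_0}{1 - (\beta(K^{n_0}_\eta)\wedge\beta(K^{n_0}_\mu))}.$$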

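Proof. The proof of the first assertion is based on the following decomposition:

$$\omega(\eta) - \omega(\mu) = \omega(\eta)(K^{n_0}_\eta - K^{n_0}_\mu) + [\omega(\eta) - \omega(\mu)]K^{n_0}_\mu.$$

Using the fact that

$$\|[\omega(\eta)-\omega(\mu)]K^{n_0}_\mu\| \leq \beta(K^{n_0}_\mu)\,\|\omega(\eta)-\omega(\mu)\|$$

we find that

(3.7) $$\|\omega(\eta)-\omega(\mu)\| \leq \frac{1}{1-\beta(K^{n_0}_\mu)}\ \|\omega(\eta)(K^{n_0}_\eta - K^{n_0}_\mu)\|.$$

On the other hand, we have

$$\|\omega(\eta)(K^{n_0}_\eta - K^{n_0}_\mu)\| \leq \|K^{n_0}_\eta - K^{n_0}_\mu\|\,\|\omega(\eta)\| = \|K^{n_0}_\eta - K^{n_0}_\mu\|.$$

Using the decomposition

$$K^{n_0}_\eta - K^{n_0}_\mu = \sum_{p=0}^{n_0-1} K^p_\eta\,(K_\eta - K_\mu)\,K^{n_0-(p+1)}_\mu$$

we find that

$$\|K^{n_0}_\eta - K^{n_0}_\mu\| \leq \sum_{p=0}^{n_0-1}\|K^p_\eta\,(K_\eta - K_\mu)\,K^{n_0-(p+1)}_\mu\|.$$

For any $0 \leq p < n_0$ we have

$$\|K^p_\eta\,(K_\eta - K_\mu)\,K^{n_0-(p+1)}_\mu\| \leq \|K^p_\eta\|\,\|K_\eta - K_\mu\|\,\|K^{n_0-(p+1)}_\mu\| \leq \|K_\eta - K_\mu\| \leq c\,\|\eta-\mu\|$$

from which we conclude that

$$\|K^{n_0}_\eta - K^{n_0}_\mu\| \leq c\,n_0\,\|\eta-\mu\| \quad\Longrightarrow\quad \|\omega(\eta)(K^{n_0}_\eta - K^{n_0}_\mu)\| \leq c\,n_0\,\|\eta-\mu\|.$$

The proof of (3.5) is now a direct consequence of (3.7).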
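
The perturbation estimate (3.7) can be illustrated numerically in the case $n_0 = 1$. In the sketch below the two transition matrices stand in for $K_\eta$ and $K_\mu$; they are hypothetical examples chosen by hand rather than kernels indexed by actual measures, and the invariant measures are approximated by power iteration.

```python
def invariant_measure(K, iters=2000):
    """Approximate omega = omega K by iterating the dual action from the uniform law."""
    d = len(K)
    omega = [1.0 / d] * d
    for _ in range(iters):
        omega = [sum(omega[x] * K[x][y] for x in range(d)) for y in range(d)]
    return omega


def tv_norm(mu):
    """Total variation norm of a signed measure on a finite space."""
    return sum(abs(m) for m in mu)


def dobrushin(K):
    """beta(K) = (1/2) max over pairs of rows of their l1 distance."""
    return 0.5 * max(tv_norm([a - b for a, b in zip(rx, ry)]) for rx in K for ry in K)


# Hypothetical stand-ins for K_eta and K_mu on {0, 1, 2}.
K_eta = [[0.5, 0.3, 0.2],
         [0.2, 0.6, 0.2],
         [0.1, 0.3, 0.6]]
K_mu = [[0.4, 0.4, 0.2],
        [0.3, 0.5, 0.2],
        [0.2, 0.3, 0.5]]
d = 3

omega_eta = invariant_measure(K_eta)
omega_mu = invariant_measure(K_mu)

# Left-hand side of (3.7): ||omega(eta) - omega(mu)||.
lhs = tv_norm([a - b for a, b in zip(omega_eta, omega_mu)])

# Right-hand side of (3.7) with n_0 = 1.
diff = [[K_eta[x][y] - K_mu[x][y] for y in range(d)] for x in range(d)]
pushed = [sum(omega_eta[x] * diff[x][y] for x in range(d)) for y in range(d)]
rhs = tv_norm(pushed) / (1.0 - dobrushin(K_mu))

# Cruder bound using the operator norm ||K_eta - K_mu||.
kernel_gap = max(tv_norm(row) for row in diff)
crude = kernel_gap / (1.0 - dobrushin(K_mu))
print(lhs, rhs, crude)
```

The gap between `rhs` and `crude` shows how much is lost when $\|\omega(\eta)(K_\eta - K_\mu)\|$ is replaced by the operator norm of the difference, which is the step that produces the constant $c\,n_0$ in (3.6).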

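The proof of the second assertion is based on the following decomposition:

$$P_\eta - P_\mu = P_\mu(K_\eta - K_\mu)P_\eta + [\omega(\mu)-\omega(\eta)]P_\eta.$$

To check this formula, we first use the fact that $K_\mu P_\mu = P_\mu K_\mu$ to prove that

$$P_\mu(K_\mu - \mathrm{Id}) = (K_\mu - \mathrm{Id})P_\mu = (\omega(\mu) - \mathrm{Id}).$$

This yields

$$P_\mu(K_\mu - \mathrm{Id})P_\eta = (\omega(\mu) - \mathrm{Id})P_\eta.$$

Using the Poisson equation and using the fact that $P_\mu(1) = 0$ we also have the decomposition

$$P_\mu(K_\eta - \mathrm{Id})P_\eta = P_\mu(\omega(\eta) - \mathrm{Id}) = -P_\mu.$$

Combining these two formulae, we conclude that

$$P_\mu(K_\eta - K_\mu)P_\eta = [P_\eta - P_\mu] - [\omega(\mu)-\omega(\eta)]P_\eta.$$

It follows that

$$\|P_\eta - P_\mu\| \le \|P_\mu(K_\eta - K_\mu)P_\eta\| + \|[\omega(\mu)-\omega(\eta)]P_\eta\|.$$

The second term on the right-hand side is easily estimated. Indeed, under our assumptions we readily find that

$$\|[\omega(\mu)-\omega(\eta)]P_\eta\| \le \beta(P_\eta)\,\|\omega(\eta)-\omega(\mu)\| \le a(\eta)\,\|\omega(\eta)-\omega(\mu)\| \le a(\eta)\,\delta_{n_0}(\eta,\mu)\,\|\eta-\mu\|.$$

On the other hand, we have

$$\|P_\mu(K_\eta - K_\mu)P_\eta\| \le \beta(P_\eta)\,\|P_\mu(K_\eta - K_\mu)\| \le \beta(P_\eta)\,\|P_\mu\|\,\|K_\eta - K_\mu\|$$

from which we conclude that

$$\|P_\mu(K_\eta - K_\mu)P_\eta\| \le 2c\,a(\mu)\,a(\eta)\,\|\eta-\mu\|.$$

The end of the proof is now clear. □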
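The resolvent decomposition used in this proof can be verified numerically for finite-state kernels. The following is a minimal sketch, assuming the resolvent is the series $P_\eta = \sum_{n\ge 0}[K_\eta^n - \omega(\eta)]$ with operators acting on functions as matrices acting on column vectors; the two 2-state kernels and all helper names are hypothetical.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def combine(a, b, sign):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def invariant_matrix(k, iters=500):
    """Matrix whose rows all equal the invariant measure of k (power iteration)."""
    n = len(k)
    row = [1.0 / n] * n
    for _ in range(iters):
        row = [sum(row[i] * k[i][j] for i in range(n)) for j in range(n)]
    return [row[:] for _ in range(n)]


def resolvent(k, n_terms=500):
    """Truncated series sum over n < n_terms of (K^n - omega)."""
    n = len(k)
    omega = invariant_matrix(k)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [[0.0] * n for _ in range(n)]
    for _ in range(n_terms):
        total = combine(total, combine(power, omega, -1.0), 1.0)
        power = matmul(power, k)
    return total


# Two hypothetical kernels standing in for K_eta and K_mu.
K_eta = [[0.9, 0.1], [0.2, 0.8]]
K_mu = [[0.5, 0.5], [0.3, 0.7]]
P_eta, P_mu = resolvent(K_eta), resolvent(K_mu)

# Left side: P_eta - P_mu.
lhs = combine(P_eta, P_mu, -1.0)
# Right side: P_mu (K_eta - K_mu) P_eta + [omega(mu) - omega(eta)] P_eta.
first = matmul(matmul(P_mu, combine(K_eta, K_mu, -1.0)), P_eta)
second = matmul(combine(invariant_matrix(K_mu), invariant_matrix(K_eta), -1.0), P_eta)
rhs = combine(first, second, 1.0)
max_gap = max(abs(x - y) for rl, rr in zip(lhs, rhs) for x, y in zip(rl, rr))
```

Both kernels mix geometrically fast, so the truncation error of the series is negligible and `max_gap` is at the level of floating-point rounding.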
3.3. $L_r$-inequalities and concentration analysis. First, we examine some of the consequences of the pair of regularity conditions presented in (3.1). The second condition ensures that the functions $a(\eta)$ and $\delta_{n_0}(\eta,\mu)$ introduced in (3.4) and (3.6) are uniformly bounded; that is, we have

$$1 \le a(n_0) := \sup_{\eta\in\mathcal{P}(E)} a(\eta) \le \frac{n_0}{1-b(n_0)} \tag{3.8}$$

and

$$d(n_0) := \sup_{(\eta,\mu)\in\mathcal{P}(E)^2} \delta_{n_0}(\eta,\mu) \le \frac{c\,n_0}{1-b(n_0)} < \infty. \tag{3.9}$$

We recall that $\bar\omega_n(\mu)$ is defined in (3.3). We are now in a position to state and prove the main result of this section.



Theorem 3.3. For any $n \ge 0$, $f \in \mathcal{B}_1(E)$ and $r \ge 1$ we have the estimate

$$E(|[\eta_n - \bar\omega_n(\mu)](f)|^r)^{1/r} \le e(r)\Big(\frac{n_0}{1-b(n_0)}\Big)^2\Big[\frac{1}{\sqrt{n+1}} + c\,E(e(n)^r)^{1/r}\Big]$$

for some finite constant $e(r) < \infty$ whose value only depends on the parameter $r$. In addition, for any $\delta \in (0,1)$ and any time horizon $n \ge 1$, the probability that

$$|[\eta_n - \bar\omega_n(\mu)](f)| \le \frac{n_0}{1-b(n_0)}\sqrt{\frac{2\log(2/\delta)}{n+1}} + (1+c)\Big(\frac{2n_0}{1-b(n_0)}\Big)^2\Big(e(n) \vee \frac{1}{n+1}\Big)$$

is greater than $(1-\delta)$ [where $c$ is the constant introduced in (3.1)].



Corollary 3.4. For the SIMC associated with the occupation measure distribution flow (3.2), we have for any $n \ge 0$, $f \in \mathcal{B}_1(E)$ and any $r \ge 1$

$$\sqrt{n+1}\,E(|[\eta_n - \bar\omega_n(\mu)](f)|^r)^{1/r} \le e(r)(1+c)\Big(\frac{n_0}{1-b(n_0)}\Big)^2$$

for some finite constant $e(r) < \infty$ whose value only depends on the parameter $r$. In addition, for any $\delta \in (0,1)$ and any time horizon $n \ge 1$, the probability that

$$|[\eta_n - \bar\omega_n(\mu)](f)| \le \Big(\frac{2n_0}{1-b(n_0)}\Big)^2\frac{1}{\sqrt{n+1}}\big\{\sqrt{2\log(2/\delta)} + 2(1+c)\big\}$$

is greater than $(1-\delta)$.






Proof of Theorem 3.3. First, we examine some consequences of the regularity conditions presented in (3.1) on the resolvent function $P_\eta$ introduced in (3.4). Using Propositions 3.1 and 3.2 we find the following uniform estimates:

$$\sup_{\eta\in\mathcal{P}(E)}\big((\|P_\eta\|/2) \vee \beta(P_\eta)\big) \le \frac{n_0}{1-b(n_0)}$$

and

$$\|P_\mu - P_\eta\| \le 3c\Big(\frac{n_0}{1-b(n_0)}\Big)^2\|\eta-\mu\|. \tag{3.10}$$

In addition, using Proposition 3.2 again we find that the invariant measure mapping $\omega$ is uniformly Lipschitz in the sense that

$$\|\omega(\eta)-\omega(\mu)\| \le \frac{c\,n_0}{1-b(n_0)}\,\|\eta-\mu\|.$$

For any $n \ge 0$ and any function $f \in \mathcal{B}_1(E)$, we set

$$I_n(f) := (n+1)[\eta_n - \bar\omega_n(\mu)](f) = \sum_{p=0}^{n}[f(X_p) - \omega(\mu_p)(f)].$$

Using the Poisson equation, we have

$$[\mathrm{Id} - \omega(\mu_p)] = (\mathrm{Id} - K_{\mu_p})P_{\mu_p}.$$

From this formula, we find the decomposition

$$[f(X_p) - \omega(\mu_p)(f)] = P_{\mu_p}(f)(X_p) - K_{\mu_p}(P_{\mu_p}(f))(X_p) = [P_{\mu_p}(f)(X_p) - P_{\mu_p}(f)(X_{p+1})] + \Delta M_{p+1}(f) \tag{3.11}$$

with the increments

$$\Delta M_{p+1}(f) := [P_{\mu_p}(f)(X_{p+1}) - K_{\mu_p}(P_{\mu_p}(f))(X_p)]$$

of the martingale $M_{n+1}(f)$ defined by

$$M_{n+1}(f) := \sum_{p=1}^{n+1}\Delta M_p(f) = \sum_{p=1}^{n+1}[P_{\mu_{p-1}}(f)(X_p) - K_{\mu_{p-1}}(P_{\mu_{p-1}}(f))(X_{p-1})].$$

For $n = 0$, we set $M_0(f) = 0$. The first term in the right-hand side of (3.11) can also be rewritten in the following form:

$$P_{\mu_p}(f)(X_p) - P_{\mu_p}(f)(X_{p+1}) = [P_{\mu_p}(f)(X_p) - P_{\mu_{p+1}}(f)(X_{p+1})] + [P_{\mu_{p+1}}(f)(X_{p+1}) - P_{\mu_p}(f)(X_{p+1})].$$
This yields the decomposition

$$\sum_{p=0}^{n}[P_{\mu_p}(f)(X_p) - P_{\mu_p}(f)(X_{p+1})] = [P_{\mu_0}(f)(X_0) - P_{\mu_{n+1}}(f)(X_{n+1})] + L_{n+1}(f)$$

with the random sequence

$$L_{n+1}(f) := \sum_{p=0}^{n}[P_{\mu_{p+1}} - P_{\mu_p}](f)(X_{p+1}).$$

In summary, we have established the following decomposition:

$$I_n(f) = M_{n+1}(f) + L_{n+1}(f) + [P_{\mu_0}(f)(X_0) - P_{\mu_{n+1}}(f)(X_{n+1})].$$

We estimate each term separately. First, using (3.10) we prove that

$$|P_{\mu_0}(f)(X_0) - P_{\mu_{n+1}}(f)(X_{n+1})| \le \|P_{\mu_0}\| + \|P_{\mu_{n+1}}\| \le \frac{4n_0}{1-b(n_0)}.$$

In much the same way, using (3.10) we obtain

$$\|L_{n+1}\| \le \sum_{p=0}^{n}\|P_{\mu_{p+1}} - P_{\mu_p}\| \le 3c\Big(\frac{n_0}{1-b(n_0)}\Big)^2\sum_{p=0}^{n}\|\mu_{p+1}-\mu_p\| = 3c(n+1)\Big(\frac{n_0}{1-b(n_0)}\Big)^2 e(n).$$

From these two estimates, we conclude that

$$|I_n(f)| \le |M_{n+1}(f)| + 3c(n+1)\Big(\frac{n_0}{1-b(n_0)}\Big)^2 e(n) + \frac{4n_0}{1-b(n_0)}. \tag{3.12}$$
To estimate the martingale term, we recall that the unpredictable quadratic variation process $[M(f),M(f)]_n$ of the martingale $M_n(f)$ is the cumulated sum of the square of its increments from the origin up to the current time; that is, we have

$$[M(f),M(f)]_n := \sum_{p=1}^{n}(\Delta M_p(f))^2.$$

The main simplification of our regularity conditions comes from the fact that the increments $|\Delta M_p(f)|$ are uniformly bounded. More precisely, we have the almost sure estimates

$$|\Delta M_{p+1}(f)| = |P_{\mu_p}(f)(X_{p+1}) - K_{\mu_p}(P_{\mu_p}(f))(X_p)| = \Big|\int [P_{\mu_p}(f)(X_{p+1}) - P_{\mu_p}(f)(x)]\,K_{\mu_p}(X_p,dx)\Big| \le \int |P_{\mu_p}(f)(X_{p+1}) - P_{\mu_p}(f)(x)|\,K_{\mu_p}(X_p,dx)$$

from which we conclude that

$$|\Delta M_{p+1}(f)| \le \operatorname{osc}(P_{\mu_p}(f)) \le \beta(P_{\mu_p}) \le \frac{n_0}{1-b(n_0)}.$$

By definition of the quadratic variation process $[M(f),M(f)]_n$, this implies that

$$[M(f),M(f)]_n \le \Big(\frac{n_0}{1-b(n_0)}\Big)^2 n.$$

The end of the proof is now a direct consequence of the Burkholder-Davis-Gundy inequality for martingales. For any $r \ge 1$, there exists some finite constant $e(r)$ whose value only depends on $r$, and such that for any $n$

$$E\Big(\max_{0\le p\le n}|M_p(f)|^r\Big)^{1/r} \le e(r)\,E([M(f),M(f)]_n^{r/2})^{1/r} \le e(r)\,\frac{n_0}{1-b(n_0)}\sqrt{n}.$$

Combining this estimate with (3.12), we find that

$$E(|I_n(f)|^r)^{1/r} \le e(r)\Big(\frac{n_0}{1-b(n_0)}\Big)^2\big[\sqrt{n+1} + c(n+1)E(e(n)^r)^{1/r}\big]$$

with again some finite constant $e(r)$ whose values may vary from line to line, but only depend on $r$. Recalling the definition of $I_n(f)$, we conclude that

$$E(|[\eta_n - \bar\omega_n(\mu)](f)|^r)^{1/r} \le e(r)\Big(\frac{n_0}{1-b(n_0)}\Big)^2\Big[\frac{1}{\sqrt{n+1}} + c\,E(e(n)^r)^{1/r}\Big].$$

This ends the proof of the first assertion. To prove the concentration estimates, we use the fact that

$$|[\eta_n - \bar\omega_n(\mu)](f)| \le \frac{|M_{n+1}(f)|}{n+1} + 3c\Big(\frac{n_0}{1-b(n_0)}\Big)^2 e(n) + \frac{4}{n+1}\,\frac{n_0}{1-b(n_0)}$$

from which we deduce the rather crude upper bound

$$|[\eta_n - \bar\omega_n(\mu)](f)| \le \frac{|M_{n+1}(f)|}{n+1} + (1+c)\Big(\frac{2n_0}{1-b(n_0)}\Big)^2\Big(e(n) \vee \frac{1}{n+1}\Big). \tag{3.13}$$

The Chernov-Hoeffding exponential inequality states that for every martingale $M_n$ with $M_0 = 0$ and uniformly bounded increments $\sup_n|\Delta M_n| \le a$, we have

$$P(|M_n| > tn) \le 2e^{-nt^2/(2a^2)}.$$

In our context, we have proved that $\sup_n|\Delta M_n(f)| \le n_0/(1-b(n_0))$, from which we conclude that

$$P\Big(|[\eta_n - \bar\omega_n(\mu)](f)| > t + (1+c)\Big(\frac{2n_0}{1-b(n_0)}\Big)^2\Big(e(n) \vee \frac{1}{n+1}\Big)\Big) \le 2\exp\Big(-\frac{(n+1)\,t^2\,(1-b(n_0))^2}{2n_0^2}\Big).$$

We conclude the proof of the theorem by choosing $t = \frac{n_0}{1-b(n_0)}\sqrt{\frac{2\log(2/\delta)}{n+1}}$. □
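The final choice of $t$ in the proof is the value that makes the exponential tail equal to $\delta$. The sketch below only checks this arithmetic; the function names and the sample values of $n$, $\delta$, $n_0$ and $b(n_0)$ are hypothetical.

```python
import math


def deviation_radius(n, delta, n0, b):
    """t = (n0 / (1 - b)) * sqrt(2 log(2 / delta) / (n + 1))."""
    a = n0 / (1.0 - b)
    return a * math.sqrt(2.0 * math.log(2.0 / delta) / (n + 1))


def hoeffding_tail(n, t, n0, b):
    """2 exp(-(n + 1) t^2 / (2 a^2)) with increment bound a = n0 / (1 - b)."""
    a = n0 / (1.0 - b)
    return 2.0 * math.exp(-(n + 1) * t * t / (2.0 * a * a))


# Hypothetical values: horizon n, confidence level delta, constants n0 and b(n0).
n, delta, n0, b = 1000, 0.05, 2, 0.75
t = deviation_radius(n, delta, n0, b)
tail = hoeffding_tail(n, t, n0, b)   # equals delta up to rounding
```

The radius shrinks at the rate $1/\sqrt{n+1}$, and it grows only like $\sqrt{\log(1/\delta)}$ as the confidence level is tightened.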

4. Distribution flows models. In this section, we have collected the defi- 
nition of a series of semigroups on distribution flow spaces. We also take the 
opportunity to describe some of their regularity properties we shall use in 
the further developments of the article. 

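We equip the sets of distribution flows $\mathcal{P}(S^{(l)})^{\mathbb{N}}$ with the uniform total variation distance defined by

$$\forall(\eta,\mu) \in (\mathcal{P}(S^{(l)})^{\mathbb{N}})^2 \qquad \|\eta-\mu\| := \sup_{n\ge 0}\|\eta_n-\mu_n\|.$$

We extend a given integral operator $\mu \in \mathcal{P}(S^{(l)}) \mapsto \mu L \in \mathcal{P}(S^{(l+1)})$ into a mapping

$$\eta = (\eta_n)_{n\ge 0} \in \mathcal{P}(S^{(l)})^{\mathbb{N}} \mapsto \eta L = (\eta_n L)_{n\ge 0} \in \mathcal{P}(S^{(l+1)})^{\mathbb{N}}.$$

Sometimes, we slightly abuse the notation and we denote by $\nu$ instead of $(\nu)_{n\ge 0}$ the constant distribution flow equal to a given measure $\nu \in \mathcal{P}(S^{(l)})$.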

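4.1. Time averaged semigroups. We associate with the mappings $\Phi_l$ introduced in (1.1) the mappings

$$\Phi^{(l)} : \eta \in \mathcal{P}(S^{(l-1)})^{\mathbb{N}} \mapsto \Phi^{(l)}(\eta) = (\Phi_n^{(l)}(\eta))_{n\ge 0} \in \mathcal{P}(S^{(l)})^{\mathbb{N}}$$

defined by the coordinate mappings

$$\forall\eta \in \mathcal{P}(S^{(l-1)})^{\mathbb{N}},\ \forall n \ge 0 \qquad \Phi_n^{(l)}(\eta) := \Phi_l(\eta_n).$$

We denote by

$$\Phi^{(k,l)} = \Phi^{(k)} \circ \Phi^{(k-1,l)}$$

with $0 \le l \le k$, the semigroup associated with the mappings $\Phi^{(l)}$. We also consider the time averaged transformations

$$\bar\Phi^{(l)} : \eta \in \mathcal{P}(S^{(l-1)})^{\mathbb{N}} \mapsto \bar\Phi^{(l)}(\eta) = (\bar\Phi_n^{(l)}(\eta))_{n\ge 0} \in \mathcal{P}(S^{(l)})^{\mathbb{N}}$$

defined by the coordinate mappings

$$\forall\eta \in \mathcal{P}(S^{(l-1)})^{\mathbb{N}},\ \forall n \ge 0 \qquad \bar\Phi_n^{(l)}(\eta) := \frac{1}{n+1}\sum_{p=0}^{n}\Phi_p^{(l)}(\eta) = \frac{1}{n+1}\sum_{p=0}^{n}\Phi_l(\eta_p).$$

For $l = 0$, we use the convention $\Phi_0(\eta_p) = \pi^{(0)}$ for any $0 \le p \le n$, so that with some abusive but obvious notation $\bar\Phi^{(0)}(\eta) = \pi^{(0)}$ represents the constant sequence $(\pi_n^{(0)})_{n\ge 0}$ such that $\pi_n^{(0)} = \pi^{(0)}$.

We also denote by $\bar\Phi^{(k,l)} : \mathcal{P}(S^{(l-1)})^{\mathbb{N}} \to \mathcal{P}(S^{(k)})^{\mathbb{N}}$ with $0 \le l \le k$, the semigroup associated with the mappings $\bar\Phi^{(l)}$ and defined by

$$\bar\Phi^{(k,l)} := \bar\Phi^{(k)} \circ \bar\Phi^{(k-1)} \circ \cdots \circ \bar\Phi^{(l)}.$$

We use the convention $\bar\Phi^{(k,l)} = \mathrm{Id}$, the identity operator, for $l > k$.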

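4.2. Integral operators. We associate with the kernel $\Gamma_k$ from $\mathcal{B}(S^{(k)})$ into $\mathcal{B}(S^{(k-1)})$ introduced in (1.7) the kernel $\bar\Gamma^{(k)}$ from $(\mathbb{N} \times \mathcal{B}(S^{(k)}))$ into the set $(\mathbb{N} \times \mathcal{B}(S^{(k-1)}))$ defined by

$$\bar\Gamma^{(k)}((n,f),d(p,g)) := \Sigma(n,dp) \times \Gamma_k(f,dg) \qquad\text{with}\qquad \Sigma(n,dp) := \frac{1}{n+1}\sum_{q=0}^{n}\delta_q(dp).$$

The semigroup $\bar\Gamma^{(l_2,l_1)}$ ($0 \le l_1 \le l_2$) associated with the integral operators $\bar\Gamma^{(l)}$ is defined by

$$\bar\Gamma^{(l_2,l_1)} := \bar\Gamma^{(l_2)}\,\bar\Gamma^{(l_2-1)} \cdots \bar\Gamma^{(l_1)}.$$

For $l_1 = l_2 = 0$, we use the convention $\bar\Gamma^{(0,0)} = \bar\Gamma^{(0)} = 0$ for the null measure on $(\mathbb{N} \times \mathcal{B}(S^{(0)}))$. Also observe that

$$\bar\Gamma^{(l_2,l_1)} = \Sigma^{l_2-l_1+1} \times \Gamma_{l_2,l_1}$$

where the semigroups $\Sigma^{l}$ and $\Gamma_{l_2,l_1}$, $0 \le l_1 \le l_2$, associated with the pair of integral operators $\Sigma$ and $\Gamma_l$ are

$$\Sigma^{l} = \Sigma\,\Sigma^{l-1} = \Sigma^{l-1}\,\Sigma \qquad\text{and}\qquad \Gamma_{l_2,l_1} := \Gamma_{l_2}\,\Gamma_{l_2-1} \cdots \Gamma_{l_1}.$$

We use the convention $\Sigma^0 = \mathrm{Id}$.

We end this section with a technical lemma relating the regularity properties (1.7) of the mappings $\Phi_k$ to the regularity properties of the semigroups $\bar\Phi^{(l_2,l_1)}$.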
Lemma 4.1. For any $0 \le l_1 \le l_2$, $n \ge 0$, any flows of measures $\eta, \mu \in \mathcal{P}(S^{(l_1-1)})^{\mathbb{N}}$ and any function $f \in \mathcal{B}(S^{(l_2)})$ we have

$$|[\bar\Phi_n^{(l_2,l_1)}(\eta) - \bar\Phi_n^{(l_2,l_1)}(\mu)](f)| \le \int |[\eta_p - \mu_p](g)|\,\bar\Gamma^{(l_2,l_1)}((n,f),d(p,g)).$$

Proof. Notice that we have $\bar\Phi^{(l,l)} = \bar\Phi^{(l)}$. We also observe that $\bar\Gamma^{(l_2,l_1)}$ is a kernel from $(\mathbb{N} \times \mathcal{B}(S^{(l_2)}))$ into $(\mathbb{N} \times \mathcal{B}(S^{(l_1-1)}))$. We prove the lemma by induction on the parameter $k = l_2 - l_1$. The result is clearly true for $k = 0$. Indeed, by (1.7) we find that for any $l \ge 0$

$$|[\bar\Phi_n^{(l)}(\eta) - \bar\Phi_n^{(l)}(\mu)](f)| \le \frac{1}{n+1}\sum_{p=0}^{n}|[\Phi_l(\eta_p) - \Phi_l(\mu_p)](f)| \le \frac{1}{n+1}\sum_{p=0}^{n}\int_{\mathcal{B}(S^{(l-1)})}|[\eta_p - \mu_p](g)|\,\Gamma_l(f,dg).$$

Rewritten in terms of $\bar\Gamma^{(l)}$, we have proved that

$$|[\bar\Phi_n^{(l)}(\eta) - \bar\Phi_n^{(l)}(\mu)](f)| \le \int |[\eta_p - \mu_p](g)|\,\bar\Gamma^{(l)}((n,f),d(p,g)).$$

This ends the proof of the result for $k = 0$. Now, suppose we have proved that

$$|[\bar\Phi_p^{(l_2,l_1)}(\eta) - \bar\Phi_p^{(l_2,l_1)}(\mu)](g)| \le \int |[\eta_q - \mu_q](h)|\,\bar\Gamma^{(l_2,l_1)}((p,g),d(q,h))$$

for any pair of integers $l_1 \le l_2$ with $l_2 - l_1 = k$ for some $k \ge 1$. In this case, for any $k \le l$ and any function $f \in \mathcal{B}(S^{(l+1)})$ we have

$$|[\bar\Phi_n^{(l+1,l-k)}(\eta) - \bar\Phi_n^{(l+1,l-k)}(\mu)](f)| = |[\bar\Phi_n^{(l+1)}(\bar\Phi^{(l,l-k)}(\eta)) - \bar\Phi_n^{(l+1)}(\bar\Phi^{(l,l-k)}(\mu))](f)|$$

and therefore

$$|[\bar\Phi_n^{(l+1,l-k)}(\eta) - \bar\Phi_n^{(l+1,l-k)}(\mu)](f)| \le \int |[\bar\Phi_p^{(l,l-k)}(\eta) - \bar\Phi_p^{(l,l-k)}(\mu)](g)|\,\bar\Gamma^{(l+1)}((n,f),d(p,g)).$$

Under our induction hypothesis, this implies that

$$|[\bar\Phi_n^{(l+1,l-k)}(\eta) - \bar\Phi_n^{(l+1,l-k)}(\mu)](f)| \le \int |[\eta_q - \mu_q](h)| \int \bar\Gamma^{(l+1)}((n,f),d(p,g))\,\bar\Gamma^{(l,l-k)}((p,g),d(q,h)) = \int |[\eta_q - \mu_q](h)|\,\bar\Gamma^{(l+1,l-k)}((n,f),d(q,h)).$$

Letting $l_1 = (l-k)$ and $l_2 = (l+1)$, we have proved that for any $l_1 \le l_2$ with $l_2 - l_1 = (k+1)$

$$|[\bar\Phi_n^{(l_2,l_1)}(\eta) - \bar\Phi_n^{(l_2,l_1)}(\mu)](f)| \le \int |[\eta_p - \mu_p](g)|\,\bar\Gamma^{(l_2,l_1)}((n,f),d(p,g)).$$

This ends the proof of the lemma. □

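4.3. Path space semigroups. To simplify the presentation, we fix a time horizon $m \ge 1$ and write $\omega$ instead of $\omega^{(m)}$, the invariant measure mapping defined in (1.9). We also write $E$ instead of $E_m$.

We extend the mapping $\omega$ on $\mathcal{P}(E)$ to $\mathcal{P}(E)^{\mathbb{N}}$ by setting

$$\omega : \eta = (\eta_n)_{n\ge 0} \in \mathcal{P}(E)^{\mathbb{N}} \mapsto \omega(\eta) = (\omega_n(\eta))_{n\ge 0} \in \mathcal{P}(E)^{\mathbb{N}}$$

with the coordinate mappings $\omega_n$ defined by

$$\omega_n(\eta) := \omega(\eta_n) = \pi^{(0)} \otimes \Phi_1(\eta_n^{(0)}) \otimes \cdots \otimes \Phi_m(\eta_n^{(m-1)}).$$

For every $l \le m$, we recall that $\eta_n^{(l)}$ stands for the image measure on $S^{(l)}$ of a given measure $\eta_n \in \mathcal{P}(E_m)$. We also consider the mappings

$$\bar\omega : \eta \in \mathcal{P}(E)^{\mathbb{N}} \mapsto \bar\omega(\eta) = (\bar\omega_n(\eta))_{n\ge 0} \in \mathcal{P}(E)^{\mathbb{N}}$$

defined by the coordinate mappings

$$\forall\eta = (\eta_n)_{n\ge 0} \in \mathcal{P}(E)^{\mathbb{N}},\ \forall n \ge 0 \qquad \bar\omega_n(\eta) := \frac{1}{n+1}\sum_{p=0}^{n}\omega_p(\eta) = \frac{1}{n+1}\sum_{p=0}^{n}\omega(\eta_p).$$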



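Lemma 4.2. For any $1 \le k \le m$ and any flow of measures $\eta \in \mathcal{P}(E)^{\mathbb{N}}$, we have

$$\omega^k(\eta) = \pi^{[k-1]} \otimes \bigotimes_{i=0}^{m-k}\Phi^{(i+k,i+1)}(\eta^{(i)}).$$

For $k = m+1$, we have

$$\forall\eta \in \mathcal{P}(E)^{\mathbb{N}} \qquad \omega^{m+1}(\eta) = \pi^{[m]}.$$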



Proof. We use a simple induction on the parameter k. The result is 
clearly true for k = 1. Suppose we have proved the result at some rank k. In 
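this case we have

$$\omega^{k+1}(\eta) = \omega^k(\omega(\eta)) = \pi^{[k-1]} \otimes \pi^{(k)} \otimes \bigotimes_{i=1}^{m-k}\Phi^{(i+k,i+1)}(\Phi^{(i)}(\eta^{(i-1)})) = \pi^{[k-1]} \otimes \pi^{(k)} \otimes \bigotimes_{i=1}^{m-k}\Phi^{(i+k,i)}(\eta^{(i-1)}) = \pi^{[k]} \otimes \bigotimes_{i=0}^{m-(k+1)}\Phi^{(i+(k+1),i+1)}(\eta^{(i)}).$$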

This ends the proof of the lemma. □ 

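Lemma 4.3. For any $1 \le k \le m$ and any $\eta = (\eta_n)_{n\ge 0} \in \mathcal{P}(E)^{\mathbb{N}}$, we have

$$\bar\omega_n^k(\eta) = \frac{1}{n+1}\sum_{p=0}^{n}\Big[\pi^{[k-1]} \otimes \bigotimes_{i=0}^{m-k}\Phi_{i+k}\big(\bar\Phi_p^{(i+(k-1),i+1)}(\eta^{(i)})\big)\Big].$$

For $k = m+1$, we have

$$\forall\eta \in \mathcal{P}(E)^{\mathbb{N}} \qquad \bar\omega^{m+1}(\eta) = \pi^{[m]}.$$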

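Proof. We use a simple induction on the parameter $k$. The result is clearly true for $k = 1$. Indeed, we have in this case

$$\bar\omega_n(\eta) = \frac{1}{n+1}\sum_{p=0}^{n}\Big[\pi^{(0)} \otimes \bigotimes_{i=0}^{m-1}\Phi_{i+1}(\eta_p^{(i)})\Big].$$

We also observe that, for $1 \le i \le m$,

$$\bar\omega_n(\eta)^{(i)} = \frac{1}{n+1}\sum_{p=0}^{n}\Phi_i(\eta_p^{(i-1)}) = \bar\Phi_n^{(i)}(\eta^{(i-1)}).$$

Suppose we have proved the result at some rank $k$. In this case, we have

$$\bar\omega_n^k(\bar\omega(\eta)) = \frac{1}{n+1}\sum_{p=0}^{n}\Big[\pi^{[k]} \otimes \bigotimes_{i=1}^{m-k}\Phi_{i+k}\big(\bar\Phi_p^{(i+(k-1),i+1)}(\bar\Phi^{(i)}(\eta^{(i-1)}))\big)\Big]$$

from which we conclude that

$$\bar\omega_n^{k+1}(\eta) = \frac{1}{n+1}\sum_{p=0}^{n}\Big[\pi^{[k]} \otimes \bigotimes_{i=0}^{m-(k+1)}\Phi_{i+(k+1)}\big(\bar\Phi_p^{(i+k,i+1)}(\eta^{(i)})\big)\Big].$$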

This ends the proof of the lemma. □ 

5. Asymptotic analysis. 

5.1. Introduction. This section is concerned with the asymptotic behav- 
ior of i-MCMC models as the time index n tends to infinity. 

The strong law of large numbers is discussed in Section 5.2. We present nonasymptotic $L_r$-inequalities that allow us to quantify the convergence of the occupation measures $\eta_n^{(k)} = \frac{1}{n+1}\sum_{p=0}^{n}\delta_{X_p^{(k)}}$ of i-MCMC models toward the solution $\pi^{(k)}$ of the measure-valued equation (1.1).

Section 5.3 is concerned with uniform convergence results with respect to the level index $k$. We examine this important question in terms of the stability properties of the time averaged semigroups introduced in Section 4.1. We present nonasymptotic $L_r$-inequalities for a series of i-MCMC models that do not depend on the number of levels. These estimates are probably the most important in practice since they allow us to quantify the running time of an i-MCMC model to achieve a given precision independently of the time horizon of the limiting measure-valued equation (1.1).

Our approach is based on an original combination of nonlinear semigroup 
techniques with the asymptotic analysis of time inhomogeneous Markov 
chains developed in Section 3. The following technical lemma presents a 
more or less well-known generalized Minkowski integral inequality which 
will be used in our proofs. 

Lemma 5.1 (Generalized Minkowski integral inequality). For any pair of bounded positive measures $\mu_1$ and $\mu_2$ on some measurable spaces $(E_1,\mathcal{E}_1)$ and $(E_2,\mathcal{E}_2)$, any bounded measurable function $\varphi$ on the product space $(E_1 \times E_2)$ and any $p \ge 1$, we have

$$\Big(\int_{E_1}\mu_1(dx_1)\,\Big|\int_{E_2}\varphi(x_1,x_2)\,\mu_2(dx_2)\Big|^p\Big)^{1/p} \le \int_{E_2}\Big(\int_{E_1}|\varphi(x_1,x_2)|^p\,\mu_1(dx_1)\Big)^{1/p}\mu_2(dx_2).$$

Proof. Without loss of generality, we suppose that $\varphi$ is a nonnegative function. For $p = 1$, the lemma is a direct consequence of Fubini's theorem. Let us assume that $p > 1$, and let $p'$ be such that $\frac{1}{p} + \frac{1}{p'} = 1$. First, we notice that the functions

$$\varphi_1(x_1) := \int_{E_2}\varphi(x_1,x_2)\,\mu_2(dx_2) \qquad\text{and}\qquad \varphi_p(x_2) := \Big(\int_{E_1}|\varphi(x_1,x_2)|^p\,\mu_1(dx_1)\Big)^{1/p}$$

are measurable for every $p \ge 1$. In this notation, we need to prove that $\mu_1(\varphi_1^p)^{1/p} \le \mu_2(\varphi_p)$. It is also convenient to consider the function

$$\psi(x_1,x_2) := \varphi(x_1,x_2)/\varphi_p(x_2)^{1/p'}.$$

We use the convention $\psi(x_1,x_2) = 0$, for every $x_1 \in E_1$, as long as $\varphi_p(x_2) = 0$. We observe that

$$\int \psi(x_1,x_2)^p\,\mu_1(dx_1) = \varphi_p(x_2)^p/\varphi_p(x_2)^{p/p'} = \varphi_p(x_2).$$

By construction, we have

$$\varphi_1(x_1) = \int \psi(x_1,x_2)\,\varphi_p(x_2)^{1/p'}\,\mu_2(dx_2) \le \Big[\int \psi(x_1,x_2)^p\,\mu_2(dx_2)\Big]^{1/p} \times \mu_2(\varphi_p)^{1/p'}$$

from which we conclude that

$$\mu_1(\varphi_1^p) \le \mu_2(\varphi_p)^{p/p'} \times \int \psi(x_1,x_2)^p\,\mu_1(dx_1)\,\mu_2(dx_2) = \mu_2(\varphi_p)^{p/p'} \times \mu_2(\varphi_p) = \mu_2(\varphi_p)^p.$$
The end of the proof is now clear. □ 
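On finite sets the generalized Minkowski integral inequality reduces to an inequality between weighted sums, which makes it easy to test. The sketch below is only an illustration: the weights `mu1`, `mu2` and the array `phi` are hypothetical, with `phi[i][j]` playing the role of $\varphi(x_1,x_2)$.

```python
def minkowski_sides(phi, mu1, mu2, p):
    """Return (lhs, rhs) of the generalized Minkowski integral inequality.

    phi[i][j] is the value of the function at (x1 = i, x2 = j);
    mu1 and mu2 are lists of nonnegative weights (bounded positive measures).
    """
    # Left side: (integral over x1 of |integral over x2 of phi|^p)^(1/p).
    lhs = sum(
        m1 * abs(sum(phi[i][j] * m2 for j, m2 in enumerate(mu2))) ** p
        for i, m1 in enumerate(mu1)
    ) ** (1.0 / p)
    # Right side: integral over x2 of (integral over x1 of |phi|^p)^(1/p).
    rhs = sum(
        m2 * sum(m1 * abs(phi[i][j]) ** p for i, m1 in enumerate(mu1)) ** (1.0 / p)
        for j, m2 in enumerate(mu2)
    )
    return lhs, rhs


# Hypothetical data: two points in E1, three points in E2.
mu1 = [0.5, 1.5]
mu2 = [1.0, 2.0, 0.5]
phi = [[1.0, -2.0, 0.5],
       [0.0, 3.0, -1.0]]
lhs, rhs = minkowski_sides(phi, mu1, mu2, 2.0)
```

The same function can be called with other exponents $p \ge 1$; for $p = 1$ the two sides differ only through the absolute value taken inside the inner sum.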

5.2. Strong law of large numbers. This section is mainly concerned with 
the proof of the following L r -inequalities for the occupation measure of an 
i-MCMC model at a given level. 

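Theorem 5.2. Under the regularity conditions (1.7) and (1.8), we have for any $k \ge 0$, any function $f \in \mathcal{B}_1(S^{(k)})$ and any $n \ge 0$ and $r \ge 1$

$$\sqrt{n+1}\,E(|[\eta_n^{(k)} - \pi^{(k)}](f)|^r)^{1/r} \le e(r)\sum_{l=0}^{k}(1+c_l)\Big(\frac{n_l}{1-b_l(n_l)}\Big)^2\prod_{l+1\le i\le k}2\Lambda_i. \tag{5.1}$$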



Proof. We prove the theorem by induction on the parameter $k$. First, we observe that the estimate (5.1) is true for $k = 0$. Indeed, by Corollary 3.4 we have that

$$\sqrt{n+1}\,E(|[\eta_n^{(0)} - \pi^{(0)}](f)|^r)^{1/r} \le e(r)(1+c_0)\Big(\frac{n_0}{1-b_0(n_0)}\Big)^2$$

for some finite constant $e(r) < \infty$ whose value only depends on the parameter $r$. We further suppose that the estimate (5.1) is true at rank $(k-1)$. To prove that it is also true at rank $k$, we use the decomposition

$$[\eta_n^{(k)} - \pi^{(k)}] = [\eta_n^{(k)} - \bar\Phi_n^{(k)}(\eta^{(k-1)})] + [\bar\Phi_n^{(k)}(\eta^{(k-1)}) - \bar\Phi_n^{(k)}(\pi^{(k-1)})]. \tag{5.2}$$

For every $k \ge 0$, given a realization of the chain $X^{(k-1)} := (X_p^{(k-1)})_{p\ge 0}$, the $k$th level chain $X_n^{(k)}$ behaves as a Markov chain with random Markov transitions $M^{(k)}_{\eta_n^{(k-1)}}$ dependent on the current occupation measure of the chain at level $(k-1)$. Therefore, using Corollary 3.4 again we notice that

$$\sqrt{n+1}\,E(|[\eta_n^{(k)} - \bar\Phi_n^{(k)}(\eta^{(k-1)})](f)|^r)^{1/r} \le e(r)(1+c_k)\Big(\frac{n_k}{1-b_k(n_k)}\Big)^2$$

for some finite constant $e(r) < \infty$ whose value only depends on the parameter $r$.

Using the decomposition (5.2) and Lemma 4.1, we obtain

$$|[\eta_n^{(k)} - \pi^{(k)}](f)| \le |[\eta_n^{(k)} - \bar\Phi_n^{(k)}(\eta^{(k-1)})](f)| + \int |[\eta_p^{(k-1)} - \pi^{(k-1)}](g)|\,\bar\Gamma^{(k)}((n,f),d(p,g)).$$

For every function $f \in \mathcal{B}_1(S^{(k)})$, and any $n \ge 0$, $k \ge 0$, $r \ge 1$, we set

$$J_n^{(k)}(f) := \sqrt{n+1}\,E(|[\eta_n^{(k)} - \pi^{(k)}](f)|^r)^{1/r} \qquad\text{and}\qquad J^{(k)} := \sup_{n\ge 1}\ \sup_{f:\|f\|\le 1} J_n^{(k)}(f).$$

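By the generalized Minkowski integral inequality presented in Lemma 5.1, we find that

$$J_n^{(k)}(f) \le e(r)(1+c_k)\Big(\frac{n_k}{1-b_k(n_k)}\Big)^2 + \sqrt{n+1}\int J_p^{(k-1)}(g)\,\frac{1}{\sqrt{p+1}}\,\bar\Gamma^{(k)}((n,f),d(p,g)).$$

Since we have

$$\int \frac{1}{\sqrt{p+1}}\,\Sigma(n,dp) = \frac{1}{n+1}\sum_{q=0}^{n}\frac{1}{\sqrt{q+1}} \le \frac{2}{\sqrt{n+1}} \tag{5.3}$$

we conclude that

$$J_n^{(k)}(f) \le e(r)(1+c_k)\Big(\frac{n_k}{1-b_k(n_k)}\Big)^2 + 2J^{(k-1)}\sup_{f:\|f\|\le 1}\int \|g\|\,\Gamma_k(f,dg)$$

and therefore

$$J^{(k)} \le e(r)(1+c_k)\Big(\frac{n_k}{1-b_k(n_k)}\Big)^2 + J^{(k-1)}\,2\Lambda_k.$$

Under the induction hypothesis, we have

$$J^{(k-1)} \le e(r)\sum_{l=0}^{k-1}(1+c_l)\Big(\frac{n_l}{1-b_l(n_l)}\Big)^2\prod_{l+1\le i\le k-1}2\Lambda_i$$

and therefore

$$J^{(k)} \le e(r)\Big[(1+c_k)\Big(\frac{n_k}{1-b_k(n_k)}\Big)^2 + \sum_{l=0}^{k-1}(1+c_l)\Big(\frac{n_l}{1-b_l(n_l)}\Big)^2\prod_{l+1\le i\le k}2\Lambda_i\Big] = e(r)\sum_{l=0}^{k}(1+c_l)\Big(\frac{n_l}{1-b_l(n_l)}\Big)^2\prod_{l+1\le i\le k}2\Lambda_i.$$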



This ends the proof of the theorem. □ 
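The induction in this proof unrolls a one-step recursion into the sum appearing in (5.1), and it relies on the elementary bound (5.3). The sketch below checks both numerically, with the common factor $e(r)$ dropped. The list `a` stands for the level constants $(1+c_l)(n_l/(1-b_l(n_l)))^2$ and `lam` for the constants $\Lambda_l$; all numerical values are hypothetical.

```python
import math


def bound_by_recursion(a, lam):
    """Iterate J_k = a[k] + 2 * lam[k] * J_{k-1}, starting from J_0 = a[0]."""
    j = a[0]
    for k in range(1, len(a)):
        j = a[k] + 2.0 * lam[k] * j
    return j


def bound_closed_form(a, lam):
    """Sum over l of a[l] times the product of 2 * lam[i] for l < i <= k."""
    k = len(a) - 1
    total = 0.0
    for l in range(k + 1):
        prod = 1.0
        for i in range(l + 1, k + 1):
            prod *= 2.0 * lam[i]
        total += a[l] * prod
    return total


def averaged_inverse_sqrt(n):
    """(1 / (n + 1)) * sum over q <= n of 1 / sqrt(q + 1), the left side of (5.3)."""
    return sum(1.0 / math.sqrt(q + 1) for q in range(n + 1)) / (n + 1)


# Hypothetical level constants; lam[0] is never used because level 0 has no parent.
a = [1.5, 2.0, 1.0, 3.0]
lam = [None, 0.4, 0.7, 0.5]
recursive_value = bound_by_recursion(a, lam)
closed_value = bound_closed_form(a, lam)
```

When every $2\Lambda_i$ is at most 1 the bound stays of the order of the number of levels, whereas values above 1 make it grow geometrically, which is the distinction between the cases $B = 1$ and $B > 1$ in Section 5.3.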



5.3. A uniform convergence theorem. This section focuses on the behavior of an i-MCMC model associated with a large number of levels. We establish a uniform convergence theorem under the assumption that the time averaged semigroup $\bar\Phi^{(k,l)}$ introduced in Section 4.1 is exponentially stable; that is, there exist some positive constants $\lambda_1, \lambda_2 > 0$ and an integer $k_0$ such that for every $l \ge 0$, $\eta, \mu \in \mathcal{P}(S^{(l)})^{\mathbb{N}}$ and any $k \ge k_0$ we have

$$\|\bar\Phi^{(l+k,l+1)}(\eta) - \bar\Phi^{(l+k,l+1)}(\mu)\| \le \lambda_1 e^{-\lambda_2 k}. \tag{5.4}$$

We also assume that the parameters $(b_k, c_k, n_k, \Lambda_k)$ are chosen so that

$$A := \sup_{k\ge 0}\,(1+c_k)\Big(\frac{n_k}{1-b_k(n_k)}\Big)^2 < \infty \qquad\text{and}\qquad B := 2\sup_{k\ge 1}\Lambda_k < \infty. \tag{5.5}$$

For the Feynman-Kac transformations (2.1), we give in Section 7 sufficient conditions on $G_l$ and $L_{l+1}$ ensuring (5.4) is satisfied. If (5.4) and (5.5) are both satisfied, we have the following uniform convergence result:

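Theorem 5.3. If $B = 1$, then we have for any $r \ge 1$, any parameter $n$ such that $(n+1) \ge e^{2\lambda_2(k_0+1)}$, and for any $(f_l)_{l\ge 0} \in \prod_{l\ge 0}\mathrm{Osc}_1(S^{(l)})$

$$\sup_{l\ge 0} E(|[\eta_n^{(l)} - \pi^{(l)}](f_l)|^r)^{1/r} \le \frac{e(r)}{\sqrt{n+1}}\Big[A\Big(1 + \frac{\log(n+1)}{2\lambda_2}\Big) + \lambda_1 e^{\lambda_2}\Big].$$

If $B > 1$, then we have for any $r \ge 1$, any $n$ such that $(n+1) \ge e^{2(\lambda_2+\log B)(k_0+1)}$, and for any $(f_l)_{l\ge 0} \in \prod_{l\ge 0}\mathrm{Osc}_1(S^{(l)})$

$$\sup_{l\ge 0} E(|[\eta_n^{(l)} - \pi^{(l)}](f_l)|^r)^{1/r} \le e(r)\Big[\frac{AB}{B-1} + \lambda_1\Big]\frac{e^{\lambda_2}}{(n+1)^{\alpha/2}}$$

with $\alpha := \frac{\lambda_2}{\lambda_2+\log B}$.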

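Proof. First, we notice that we have the following estimate from (5.1) and (5.5) for any $k \ge 0$:

$$\sqrt{n+1}\,E(|[\eta_n^{(k)} - \pi^{(k)}](f_k)|^r)^{1/r} \le e(r)\,A\,\frac{B^{k+1}-1}{B-1}. \tag{5.6}$$

For $B = 1$, we use the convention $\frac{B^k-1}{B-1} = k$.

We have the following decomposition:

$$\eta^{(l+k)} - \pi^{(l+k)} = [\eta^{(l+k)} - \bar\Phi^{(l+k,l+1)}(\eta^{(l)})] + [\bar\Phi^{(l+k,l+1)}(\eta^{(l)}) - \bar\Phi^{(l+k,l+1)}(\pi^{(l)})]$$

$$= \sum_{i=l+1}^{l+k}[\bar\Phi^{(l+k,i+1)}(\eta^{(i)}) - \bar\Phi^{(l+k,i+1)}(\bar\Phi^{(i)}(\eta^{(i-1)}))] + [\bar\Phi^{(l+k,l+1)}(\eta^{(l)}) - \bar\Phi^{(l+k,l+1)}(\pi^{(l)})]. \tag{5.7}$$

Recall that we use the convention $\bar\Phi^{(l_1,l_2)} = \mathrm{Id}$ for $l_1 < l_2$, so that

$$i = l+k \quad\Longrightarrow\quad \bar\Phi^{(l+k,i+1)}(\eta^{(i)}) = \eta^{(l+k)}.$$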
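Using Lemma 4.1, we find that

$$|[\bar\Phi_n^{(l_2,l_1+1)}(\eta^{(l_1)}) - \bar\Phi_n^{(l_2,l_1+1)}(\bar\Phi^{(l_1)}(\eta^{(l_1-1)}))](f_{l_2})| \le \int |[\eta_p^{(l_1)} - \bar\Phi_p^{(l_1)}(\eta^{(l_1-1)})](g)|\,\bar\Gamma^{(l_2,l_1+1)}((n,f_{l_2}),d(p,g)).$$

By the generalized Minkowski integral inequality, this implies that

$$E(|[\bar\Phi_n^{(l_2,l_1+1)}(\eta^{(l_1)}) - \bar\Phi_n^{(l_2,l_1+1)}(\bar\Phi^{(l_1)}(\eta^{(l_1-1)}))](f_{l_2})|^r)^{1/r} \le \int E(|[\eta_p^{(l_1)} - \bar\Phi_p^{(l_1)}(\eta^{(l_1-1)})](g)|^r)^{1/r}\,\bar\Gamma^{(l_2,l_1+1)}((n,f_{l_2}),d(p,g)).$$

Using Corollary 3.4, we find that

$$E(|[\bar\Phi_n^{(l_2,l_1+1)}(\eta^{(l_1)}) - \bar\Phi_n^{(l_2,l_1+1)}(\bar\Phi^{(l_1)}(\eta^{(l_1-1)}))](f_{l_2})|^r)^{1/r} \le e(r)(1+c_{l_1})\Big(\frac{n_{l_1}}{1-b_{l_1}(n_{l_1})}\Big)^2\int_{\{0,\ldots,n\}}\frac{1}{\sqrt{p+1}}\,\Sigma^{(l_2-l_1)}(n,dp) \times \int \|g\|\,\Gamma_{l_2,l_1+1}(f_{l_2},dg).$$

By (5.3) and

$$\int \Gamma_{k,l}(f_k,dg)\,\|g\| \le \Lambda_{k,l}\,\|f_k\| \qquad\text{with}\qquad \Lambda_{k,l} \le \prod_{l\le i\le k}\Lambda_i < \infty,$$

we conclude that

$$\sqrt{n+1}\,E(|[\bar\Phi_n^{(l_2,l_1+1)}(\eta^{(l_1)}) - \bar\Phi_n^{(l_2,l_1+1)}(\bar\Phi^{(l_1)}(\eta^{(l_1-1)}))](f_{l_2})|^r)^{1/r} \le e(r)\,A\,B^{l_2-l_1}\,\|f_{l_2}\|. \tag{5.8}$$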

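Using the decomposition (5.7), we prove that for every $f_{l+k} \in \mathcal{B}_1(S^{(l+k)})$ and any $k \ge k_0$

$$\sup_{l\ge 0} E(|[\eta_n^{(l+k)} - \pi^{(l+k)}](f_{l+k})|^r)^{1/r} \le e(r)\Big[\frac{A}{\sqrt{n+1}}\,\frac{B^k-1}{B-1} + \lambda_1 e^{-\lambda_2 k}\Big].$$

Finally, by (5.6), we conclude that for every $k \ge k_0$

$$\sup_{l\ge 0} E(|[\eta_n^{(l)} - \pi^{(l)}](f_l)|^r)^{1/r} \le e(r)\Big[\frac{A}{\sqrt{n+1}}\,\frac{B^{k+1}-1}{B-1} + \lambda_1 e^{-\lambda_2 k}\Big].$$

For $B = 1$, we have

$$\sup_{l\ge 0} E(|[\eta_n^{(l)} - \pi^{(l)}](f_l)|^r)^{1/r} \le e(r)\Big[\frac{A(k+1)}{\sqrt{n+1}} + \lambda_1 e^{-\lambda_2 k}\Big].$$

In this situation, we choose the parameters $k, n$ such that

$$k = k(n) := \Big\lfloor\frac{\log(n+1)}{2\lambda_2}\Big\rfloor.$$

Notice that $k(n)$ is the largest integer $k$ satisfying

$$k \le \frac{\log(n+1)}{2\lambda_2} \quad\Longleftrightarrow\quad \frac{1}{\sqrt{n+1}} \le e^{-\lambda_2 k}.$$

Since $(k(n)+1) > \frac{\log(n+1)}{2\lambda_2}$, we have

$$e^{-\lambda_2 k(n)} \le e^{\lambda_2}\,e^{-\lambda_2(\log(n+1))/(2\lambda_2)} = \frac{e^{\lambda_2}}{\sqrt{n+1}}$$

from which we conclude that

$$\frac{A\,(k(n)+1)}{\sqrt{n+1}} + \lambda_1 e^{-\lambda_2 k(n)} \le \frac{1}{\sqrt{n+1}}\Big[A\Big(1 + \frac{\log(n+1)}{2\lambda_2}\Big) + \lambda_1 e^{\lambda_2}\Big].$$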

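For $B > 1$, we choose the parameters $k, n$ such that

$$k = k(n) := \Big\lfloor\frac{\log(n+1)}{2(\lambda_2+\log B)}\Big\rfloor.$$

Notice that $k(n)$ is the largest integer $k$ such that

$$k \le \frac{\log(n+1)}{2(\lambda_2+\log B)} \quad\Longleftrightarrow\quad \frac{B^k}{\sqrt{n+1}} \le e^{-\lambda_2 k}.$$

Since $(k(n)+1) > \frac{\log(n+1)}{2(\lambda_2+\log B)}$, we have

$$\frac{B^{k(n)}}{\sqrt{n+1}} \le e^{-\lambda_2 k(n)} \le e^{\lambda_2}\,e^{-\lambda_2(\log(n+1))/(2(\lambda_2+\log B))} = \frac{e^{\lambda_2}}{(n+1)^{\alpha/2}}$$

with $\alpha := \frac{\lambda_2}{\lambda_2+\log B}$, from which we conclude that

$$\frac{A}{\sqrt{n+1}}\,\frac{B^{k(n)+1}-1}{B-1} + \lambda_1 e^{-\lambda_2 k(n)} \le \Big[\frac{AB}{B-1} + \lambda_1\Big]\frac{e^{\lambda_2}}{(n+1)^{\alpha/2}}.$$

This ends the proof of the theorem. □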
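The proof balances the two error terms by choosing the level index as a function of the time horizon. The following sketch computes that choice and the resulting rate exponent; it treats $B = 1$ as the special case $\log B = 0$. The function names and the sample values of $\lambda_2$ and $B$ are hypothetical.

```python
import math


def level_choice(n, lam2, big_b):
    """Largest integer k with k <= log(n + 1) / (2 * (lam2 + log B)); needs B >= 1."""
    return math.floor(math.log(n + 1) / (2.0 * (lam2 + math.log(big_b))))


def rate_exponent(lam2, big_b):
    """alpha = lam2 / (lam2 + log B); equals 1 when B = 1."""
    return lam2 / (lam2 + math.log(big_b))


# Hypothetical stability rate lam2 and growth constant B.
lam2, big_b = 0.5, 2.0
n = 10 ** 6
k = level_choice(n, lam2, big_b)
alpha = rate_exponent(lam2, big_b)
# Both error terms are then of order (n + 1)^(-alpha / 2).
geometric_term = big_b ** k / math.sqrt(n + 1)
stability_term = math.exp(-lam2 * k)
```

With $B > 1$ the exponent $\alpha$ is strictly below 1, so the uniform rate $(n+1)^{-\alpha/2}$ is slower than the rate $1/\sqrt{n+1}$ available at a fixed level.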



6. Path space models. In the previous section, we have established $L_r$-mean error bounds and exponential estimates quantifying the convergence of the occupation measures $\eta_n^{(k)}$ toward the solutions $\pi^{(k)}$ of the measure-valued equation (1.1). We show here that it is also possible to establish such results to quantify the convergence of the path-space occupation measures $\eta_n^{[m]}$ introduced in (1.6) toward the tensor product measure $\pi^{[m]}$ defined in (1.10).

6.1. $L_r$-mean error bounds. Our main result is the following theorem:

Theorem 6.1. For every $f \in \mathcal{B}(E_m)$, we have

$$\sup_{n\ge 1}\sqrt{n}\,E(|[\eta_n^{[m]} - \pi^{[m]}](f)|^r)^{1/r} < \infty.$$

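Proof. To simplify the presentation, we fix a time horizon $m \ge 1$ and write $\omega$ instead of $\omega^{(m)}$, the invariant measure mapping defined in (1.9). We also write $E$ instead of $E_m$, and $\eta_n$ instead of $\eta_n^{[m]}$. In this notation, $(\eta_n^{(l)})$ represents the sequence of occupation measures $\eta_n^{(l)} := \frac{1}{n+1}\sum_{p=0}^{n}\delta_{X_p^{(l)}} \in \mathcal{P}(S^{(l)})$ of the i-MCMC model on the $l$th level space $S^{(l)}$.

Using the fact that $\bar\omega^{m+1}(\eta) = \pi^{[m]}$, we obtain the following decomposition for any $\eta \in \mathcal{P}(E)^{\mathbb{N}}$

$$\eta - \pi^{[m]} = \sum_{k=0}^{m}[\bar\omega^k(\eta) - \bar\omega^{k+1}(\eta)]. \tag{6.1}$$

In the above-displayed formula, $\pi^{[m]} = (\pi_n^{[m]})_{n\in\mathbb{N}} \in \mathcal{P}(E)^{\mathbb{N}}$ stands for the constant sequence of measures $\pi_n^{[m]} = \pi^{[m]}$, for any $n \in \mathbb{N}$.

Using Lemma 4.3, the $k$th iterate $\bar\omega^k$ of the mapping $\bar\omega$ can be rewritten for any $\eta \in \mathcal{P}(E)^{\mathbb{N}}$ in the following form:

$$\bar\omega_n^k(\eta) = \frac{1}{n+1}\sum_{p=0}^{n}\big[\pi^{[k-1]} \otimes \Pi_p^{(k,m)}((\eta^{(i)})_{0\le i\le m-k})\big].$$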
Here the mappings 

/ m \ N 

0<i<m \i=jfc / 

are defined for any n > by 

m—k m—k 

i=0 i=0 
with for any (p^)o<i<m G IIo<t<m? J ('S , ®) N and an y 0<i<m-k 

i4*."»).«(( M (0) J ) := $ i+fc (i( i i +( fe - 1 )' i + 1 )( At W)) e p(5( i+fc )). 

We emphasize that (/x) only depends on the flow of measures (//^)o<j<m-fc) 

and 

= :r^E^ 1 ®4* 1,ro W )i)] 

p=0 

1 n f m— (fe+1) 

— y 

4- 1 



n + 1 

p=0 



1 - 



Tfl*- 1 !®^® (g) ^ +fe+ i(^ +M+2) ($ (m) (r ? W ))) 

i=0 

m— A: 

vf [ ^ 1] (g) 

i=0 



with the convention $(°)(t/ = tt^°\ for i = 0. This implies that for any 
< k < m 

1 n 
p=0 

and therefore 

^n(^)-^n +1 (??) 

1 ™ _ 

( 6 - 2 ) = ttti E^ [fc " 1] ® {nf m) ((?? (/) W - nJ^CC*®^)),)}]. 

71 T 1 

p=0 

Moving one step further, we introduce the decomposition 

i#' m V) -n( fc ' m V) 



(6.3) 
m-k ( /j-l \ 

j=0 I \i=0 / 

[ n (fc,m.),(j)^ _ n (fc,m),(j)^ 

(m—k \ 1 

n (fe,m) 5 (i) (//) j j 

for any /x = (Ai (,) )o<i<»n and i/ = (i/W) <j< m £ IIo<i<m W (i) ) N > with the 
flow of signed measures 

= [*i+*(^" +(fc - 1)J+1) (M w )) - *i + »(*? +(fc - lw+1) (^ ) ))]. 

For every we find that 

|[nM'WM-n( fc - m )-W(,)](/)l 

(6.4) < yi[($i i+(fc-i),i+i) (/" (i) )) 

-(lO+^-DJ+D^)))]^)!^^/,^). 

We let J-"™'' 7 be the sigma field given by 

J^J = : < p < n, < Z < m, Z / j). 

Combining the generalized Minkowski integral inequality presented in Lemma 
5.1 with the inequality (5.8), we prove that 

E(|[n( fc '™)<w ( ^^ 

< j E(|[(¥^+( fe - 1 )^+ 1 )(#)) 

_ (¥0-+(^-i),i+i) ( $(i) (w 0-i) )))](5)r |^ ) iA x r j+fc (/,^) 

Notice that the decomposition (6.3) can be rewritten for any / £ <6(ni^=fc S^) 
in the following form: 

[n^M-nM (*,)](/) 

(6.5) 

m— fc 

= ^[n(M,(i) W -i4M 1 0-) M]( 4M l W( /J)!/ )( / )) 

j=0 
with the integral operators $R_n^{(k,m),(j)}(\mu,\nu) : \mathcal{B}(\prod_{l=k}^{m}S^{(l)}) \to \mathcal{B}(S^{(j+k)})$ given 
below 

R^\^v)(f){x k+j ) 

fipki • • • j %k+(j — 1)) %k+ji Xk+j+li • • • ) x m) 



X 



fj-\ \ / m-k \ 

(dx l+k ) x ; U^\^(dx i+k ) 



\i=0 / \i=jr'+l / 

Using the fact that the pair of measures 

j—l m—k 

gn^l-Wf^'^f 1 ')),) and (g) Ut^Kiv^i) 

i=0 i=j+l 

only depend on the distribution flow (cj>(*) (?7^~ 1 ^ ))o<i<j— 1 and (r]^)j + i<i< m -. k , 
we find that the random functions 

f (k,mm := R^rnm^Q)^ ($(0 (^"D)),) (/) G £(S0'+ fc )) 

do not depend on the distribution flows rj^ and rj^~ l \ This shows that 
j{k-,rn),{j) are measura bi e -with respect to J-™^ . From previous calculations 
(and again using the generalized Minkowski integral inequality presented in 
Lemma 5.1) we find that 

E(|[n^)-W((^ 

-($^( fe - 1 )^'+ 1 )(¥0-)(^- 1 ))))]( 5 )|HJr J ) 1/r 
<-^= AB k \\f\\. 

We conclude that for any / £ S(rifc<j<m S^) 

E(|[4 fc ^)((r7«) z )-n^)((¥«(^- 1 )))J](/)r) 1A 

<(m-fc + l)-^i=^ fc ||/||. 
V n + 1 

Using (6.5), it is now easily checked that for every / € B(E) 

E(|[i5*(^-cajj+ 1 (^)](/)| r ) 1/r < (m-k + l)-^L=AB k 
Finally, by (6.1) we conclude that 

( \ m 

This ends the proof of the theorem. □ 

6.2. Concentration analysis. This section is mainly concerned with exponential bounds for the deviations of the occupation measures $\eta_n^{[m]}$ around the limiting tensor product measure $\pi^{[m]}$. We restrict our attention to models satisfying the Lipschitz type condition (1.7) for some kernel $\Gamma_k$ with uniformly finite support

$$\sup_{f\in\mathcal{B}(S^{(k)})}\mathrm{Card}(\mathrm{Supp}(\Gamma_k(f,\cdot))) < \infty.$$

To simplify the presentation, we fix a parameter $m \ge 1$, and sometimes we write $\eta_n$ instead of $\eta_n^{[m]}$. We shall also use the letters $c_i$, $i \ge 1$, to denote some finite constants whose values may vary from line to line but do not depend on the time parameter $n$.

The main result of this section is the following concentration theorem:

Theorem 6.2. There exists a finite constant $\sigma_m < \infty$ such that for any $f \in \mathcal{B}_1(E_m)$ and $t > 0$

$$\limsup_{n\to\infty}\frac{1}{n}\log P(|[\eta_n^{[m]} - \pi^{[m]}](f)| > t) < -\frac{t^2}{2\sigma_m^2}.$$

The proof of this theorem is based on two technical lemmas.

Lemma 6.3. We let $M = (M_n)_{n\ge 1}$ be a random process such that the following exponential inequality is satisfied for some positive constants $a, b > 0$ and for any $t \ge 0$ and $n \ge 1$

$$P(|M_n| \ge t\sqrt{n}) \le a\,e^{-bt^2}.$$

We consider the collection of random processes $M^{(k)} = (M_n^{(k)})_{n\ge 1}$ defined for any $n \ge 0$ and $k \ge 0$ by the following formula:

$$M_{n+1}^{(k)} := (n+1)\int \Sigma^k(n,dp)\,\frac{1}{p+1}\,M_{p+1},$$

where $\Sigma^k$ is the semigroup associated to the operator $\Sigma$ defined in (4.1). For every $k \ge 0$, $n \ge 1$, and $t \ge 0$ we have the exponential inequalities:

$$P(|M_n^{(k)}| \ge t\sqrt{n}) \le a\,n^k\,e^{-bt^2/2^{2k}}.$$
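Proof. We prove the lemma by induction on the parameter $k$. For $k = 0$, we have $M_{n+1}^{(0)} := M_{n+1}$ so that the exponential estimate holds true with $a(0) = a$ and $b(0) = b$. Suppose we have proved the result at rank $k$. Using the fact that

$$M_{n+1}^{(k+1)} = (n+1)\int \Sigma^{k+1}(n,dp)\,\frac{1}{p+1}\,M_{p+1} = (n+1)\int \Sigma(n,dp)\,\frac{1}{p+1}\Big((p+1)\int \Sigma^k(p,dq)\,\frac{1}{q+1}\,M_{q+1}\Big)$$

we prove the recursion formula

$$M_{n+1}^{(k+1)} = (n+1)\int \Sigma(n,dp)\,\frac{1}{p+1}\,M_{p+1}^{(k)}.$$

On the other hand, we have

$$\frac{1}{\sqrt{n+1}}\,M_{n+1}^{(k+1)} = \sqrt{n+1}\int \Sigma(n,dp)\,\frac{1}{\sqrt{p+1}}\,\frac{M_{p+1}^{(k)}}{\sqrt{p+1}}$$

and

$$\int \Sigma(n,dp)\,\frac{1}{\sqrt{p+1}} = \frac{1}{n+1}\sum_{p=0}^{n}\frac{1}{\sqrt{p+1}} \le \frac{2}{\sqrt{n+1}}.$$

Under the induction hypothesis, we have for any $0 \le p \le n$

$$P(|M_{p+1}^{(k)}| > t\sqrt{p+1}) \le a(n+1)^k e^{-bt^2/2^{2k}}.$$

This implies that

$$P\Big(\frac{1}{\sqrt{n+1}}|M_{n+1}^{(k+1)}| > t\Big) \le P\Big(\exists\,0 \le p \le n : |M_{p+1}^{(k)}| > \frac{t}{2}\sqrt{p+1}\Big) \le a(n+1)(n+1)^k e^{-bt^2/2^{2(k+1)}}$$

from which we conclude that

$$P(|M_{n+1}^{(k+1)}| > t\sqrt{n+1}) \le a(n+1)^{k+1}e^{-bt^2/2^{2(k+1)}}.$$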

This ends the proof of the lemma. □ 
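The factor $2^{2k}$ in this lemma comes from applying the averaging operator $\Sigma$ repeatedly, each application costing a factor 2 in the bound on the integral of $1/\sqrt{p+1}$. The sketch below iterates the operator on a finite horizon, assuming as in Section 4.2 that $\Sigma(n,\cdot)$ is the uniform measure on $\{0,\ldots,n\}$; the function name is hypothetical.

```python
import math


def sigma_power_integral(g, n, k):
    """Integral of g(p) against Sigma^k(n, dp), with Sigma(m, .) uniform on {0..m}."""
    # values[m] holds the integral of g against Sigma^j(m, .) after j passes.
    values = [g(m) for m in range(n + 1)]
    for _ in range(k):
        running = 0.0
        averaged = []
        for m in range(n + 1):
            running += values[m]
            averaged.append(running / (m + 1))   # (Sigma h)(m) = mean of h(0..m)
        values = averaged
    return values[n]


def inverse_sqrt(p):
    """The integrand 1 / sqrt(p + 1) used in (5.3) and in the proof above."""
    return 1.0 / math.sqrt(p + 1)


n, k = 50, 3
iterated = sigma_power_integral(inverse_sqrt, n, k)
upper = 2.0 ** k / math.sqrt(n + 1)   # k applications of the bound (5.3)
```

For $k = 0$ the function returns $g(n)$ itself, matching the convention $\Sigma^0 = \mathrm{Id}$.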



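Lemma 6.4. For every $l_1 \le l_2$, there exists some nonincreasing function

$$N : t \in [0,\infty) \mapsto N(t) \in [0,\infty)$$

such that for every $n \ge N(t)$ and any function $f \in \mathcal{B}_1(S^{(l_2)})$ we have

$$P\big(\sqrt{n+1}\,|[\bar\Phi_n^{(l_2,l_1+1)}(\eta^{(l_1)}) - \bar\Phi_n^{(l_2,l_1+1)}(\bar\Phi^{(l_1)}(\eta^{(l_1-1)}))](f)| > t\big) \le (c_1(n+1))^{(l_2-l_1)}\exp(-c_2t^2/c_3^{l_2-l_1}).$$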

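Before getting into the details of the proof of this lemma, it is interesting to mention a direct consequence of the above exponential estimates. First, we observe that $N(t\sqrt{n+1}) \le N(t)$ so that for any $t \ge 0$ and $n \ge N(t)$ we have

$$P\big(|[\bar\Phi_n^{(l_2,l_1+1)}(\eta^{(l_1)}) - \bar\Phi_n^{(l_2,l_1+1)}(\bar\Phi^{(l_1)}(\eta^{(l_1-1)}))](f)| > t\big) \le (c_1(n+1))^{(l_2-l_1)}\exp(-c_2(n+1)t^2/c_3^{l_2-l_1}).$$

Using the decomposition

$$\eta^{(k)} - \pi^{(k)} = \sum_{l=0}^{k}[\bar\Phi^{(k,l+1)}(\eta^{(l)}) - \bar\Phi^{(k,l+1)}(\bar\Phi^{(l)}(\eta^{(l-1)}))]$$

we prove the following inclusion of events:

$$\{|[\eta_n^{(k)} - \pi^{(k)}](f)| > t\} \subset \{\exists\,0 \le l \le k : |[\bar\Phi_n^{(k,l+1)}(\eta^{(l)}) - \bar\Phi_n^{(k,l+1)}(\bar\Phi^{(l)}(\eta^{(l-1)}))](f)| > t/(k+1)\}.$$

By Lemma 6.4 we can find a sufficiently large integer $N(t)$ that may depend on the parameter $k$ and such that for every $n \ge N(t)$

$$P(|[\eta_n^{(k)} - \pi^{(k)}](f)| > t) \le \sum_{0\le l\le k}(c_1(n+1))^{(k-l)}e^{-(n+1)t^2c_2/((k+1)^2c_3^{k-l})} \le (k+1)(c_1(n+1))^k e^{-(n+1)t^2c_2/((k+1)^2c_3^{k})}.$$

This clearly implies the existence of some finite constant $\sigma_k < \infty$ such that

$$\limsup_{n\to\infty}\frac{1}{n}\log P(|[\eta_n^{(k)} - \pi^{(k)}](f)| > t) < -\frac{t^2}{2\sigma_k^2}.$$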



Proof of Lemma 6.4. Using Lemma 4.1, we find that 

I [M 2 ' ll+1) ) - ^ 2,/l) (® ih) ))] (/) I 



Arguing as in (3.13), we find that for any g £ B(S^ 11 ^), we have 
(6.6) ISM -IW'W"- 1 ')]^ < ^If 1 + ^ !;; ^NI 
with a sub-Gaussian process $M_n^{(l_1)}(g)$ satisfying the following exponential inequality for any $t \ge 0$ and any time parameter $n \ge 1$:

$$P(|M_n^{(l_1)}(g)| \ge t\sqrt{n}) \le 2\exp(-c_2t^2/\|g\|^2).$$

We notice that 

1 A (log (p + 2)) k ^ (log (n + 2)) fc 



(log (p + 2)f < (log (n + 2)) fe 1 

(log(n + 2)) fc A [P +2 1 



n + 2^ p + 2 n + 2 z -'p + 2 

p=0 p=0 



< 



fe+1 



_ (log (n + 2)) 
n + 2 

This implies that 

log (p + 2) flog (n + 2)) 2 



n + 2 

More generally for any k > 0, we have that 

log (p + 2) .(log(n + 2)) l+1 



/E*(„,*)l2i^<2' 



n + 2 

from which we prove that 

log{p + 2 \\g\\T^ h+1 H(n,f),d(P,g)) 



P + 2 

< 2 fe-' ^"i7';- - / ibiir Wl+l( /,^) 



(6.7) 



■II) ( 1Q g 


(n + 2))^ 






n + 2 




■ii)0°g 


(n + 2))^- 


-2i)+l 




n + 2 




h) (!og 


(n + 2))fe- 




n + 2 



h<i<h 



For any geBiS^) we set 

^(^-/s^-^^dp)^^. 

Using Lemma 6.3, we prove that 

F(M { ^[ 2 \g) >t)< 2(n + exp(-c 2 (r* + l)t 2 /[2 2( - l ^\\g\\ 2 }). 
We observe that 

/ ^\M^\(g)\T^+ 1 \(n,f),d(p,g)) = J M^\g)F hM+1 (f,dg). 
In addition, using (6.6) and (6.7) we find that 

|[*2 3 ' ,1+1) (^ (ll) )-*n (,9 ' il) (* (ll) (»7 (ll " 1) ))](/)l 

(6.8) 

< / 

{9Wl 2 ,h+x(fidg) +£i lt i 2 (n) 

with 

^ (B)! _^)0Si&i±S^. 
Using the inclusion of events 

{jMlXi\9)T hA+ i(f,dg)>t} 

C {3g E Supp(r Wl+1 (/,•)) such that M^[ 2 \g) > t\\g\\/(A hth+1 )} 



we find that 



J M i £[ 2 \g)T h , h+1 (f,dg)>t 



< S^ifMM^ig) > *||«7||/(A Wl+1 )). 
Finally, under our assumptions we have 

Si 2 ,h+x(f) = Card(Supp(r /2i/l+1 (/, •))) 

< H sup Card(Supp(r fc (/,-)))<cJ 2 "' l) 

h+l<k<h /eB(S (fc) ) 

from which we check that 

M<n 1 + l 1 2) (g)r l2 , h+1 (f,dg)>^ 

<( C5 (n+l)) (Z2 "' l) exp(- C6 ( ? i + l)t 2 /c? 2 " /l) )- 
Using (6.8), we conclude that 

< (c 5 (n + l))('"- l exp(-c 6 (n + l)i 2 /c^ 2 ) . 
To take the final step, we observe that

<pn[$(ja.'i+ 1 )(^ I i))-^wi)(5(ii)(^(i 1 -i)))]^)| 

> r '—-r + £ h,h( n ) )• 

We also notice that for any $t > 0$ we can find some nonincreasing function
$N(t)$ such that

\[ \forall n \ge N(t) \qquad \sqrt{n+1}\,\varepsilon_{l_1,l_2}(n) \le t. \]

This implies that for any $n \ge N(t)$ we have

P(v^^|[$l' 2 ' Zl+1) (r/ (/l) ) -^ 2 ' /l) (^ l) (¥ il " 1) ))](/)l > 2£) 
< (c 5 (n + 1))^ expC-ce^/c? 2 "' )- 
The end of the proof is now straightforward. □ 

We are now in position to prove Theorem 6.2. 

Proof of Theorem 6.2. We use the same notation as we used in the 
proof of Theorem 6.1. Using (6.4) we find that 

\[ni k > m ^\ri-nt m) ^\v)](f)\>t 

=> 3g G Supp(r i+fe (/, •)) :\[(^ k ~^ j+1 H^)) 

-(^ k - 1 ^ +1 \u^))](g)\>t\\g\\/A j+k . 

Therefore, using Lemma 6.4 we can find a nonincreasing function $N(t)$ (that
may depend on the parameter $k$), such that for every $n \ge N(t)$ and any
$f \in \mathcal{B}_1(S^{(j+k)})$ we have

p( V^TT\ [n(f W ( M ) - n<M0,c>) („)] (/) i > £) 

<(c 1 (n+l)) (fc - 1) exp(-c 2 £ 2 /4 fc - 1) ). 

In much the same way, by the decomposition (6.5) we find the following 
assertion: 

| [n^^-n^H] (/)|>£ 

30 <j < (m - k) : \[Tlt m ^\n) - lT^^H] 

x{Rg> m Mfav)(f))\>t/(m-k + l). 
Since $R_n^{(k,m),(j)}$ maps $\mathcal{B}_1(S^{(m)})$ into $\mathcal{B}_1(S^{(j+k)})$ we have for every
parameter $n \ge N(t)$

p(v / ^+T|[n^ m )((^)) i ) -n( fc ' m )((¥«(^- 1 ))) i )](/)| >t) 

< (m - fc + l)(d(n + l))'"" 1 exp(-c 2 t 2 /((m - * + 1) 2 <£ -1 ))- 

In summary, we have proved that there exists some nonincreasing function
$N(t)$ that may depend on the parameter $m$ such that for any $0 \le k \le m$,
any $f \in \mathcal{B}_1(E)$, and any $n \ge N(t)$ we have

<(c4(n + l)) m exp(- C5 t 2 /^). 

Let $(U_n)_{n \ge 1}$ be a collection of $[0,1]$-valued random variables for which
there exists some nonincreasing function $N(t)$ such that, for any $t$ and any $n \ge N(t)$,

\[ \mathbb{P}(\sqrt{n}\,U_n > t) \le a\,n^{\alpha} e^{-t^2 b} \]

for some integer $\alpha \ge 1$ and some pair of positive constants $(a, b)$. In this
situation, we can find a nonincreasing function $N'(t)$ and a pair of positive
constants $(a', b')$ such that



Vn > N'(t)F U p > V^J < a'n a+1 e- t2b ' 

To prove this claim, we simply use the fact that for any n > N(t) we have 

This yields that for any n > iV(i) 

\ V p=l V / p=N(t) 

We let $N'(t)$ be the smallest integer $n$ such that $N(t)/\sqrt{n} \le t$. Recalling
that $N(t)$ is a nonincreasing function, we find that for any $s \ge t$

\[ N(t)/\sqrt{n} \le t \implies N(s)/\sqrt{n} \le N(t)/\sqrt{n} \le t \le s \implies N(s)/\sqrt{n} \le s. \]

This implies that $N'(s) \le N'(t)$. Thus, we have constructed a nonincreasing
function $N'(t)$ such that for any $n \ge N'(t)$



"U n >2t I < an Q+1 e-* b / 4 . 
P =i 
This ends the proof of the assertion with $(a', b') = (a, b/2^4)$. Applying this
property to the decomposition (6.2), we can find a nonincreasing function
$N(t)$ such that for any $n \ge N(t)$ and any $0 \le k \le m$

F(V^\[uJ k n (v)-^ +1 mf)\>t) < (c 7 (n + l)r +1 exp(-c 8 t 2 /^). 

The end of the proof of the theorem is now a direct consequence of the 
decomposition (6.1). □ 

7. Feynman-Kac semigroups. In Section 5.3, we established a uniform
convergence theorem under the assumption that the time averaged semigroup
$\Phi^{(k,l)}$ introduced in Section 4.1 is exponentially stable; that is, it
satisfies (5.4). In this section, we study the mappings associated with
the Feynman-Kac transformations discussed in (7.2). We provide sufficient
conditions ensuring that (5.4) is satisfied in this case.

7.1. Description of the models. To precisely describe these mappings we 
need a few definitions. 

Definition 7.1. We denote by $\Psi_G$ the Boltzmann-Gibbs transformation
associated with a positive potential function $G$ on $S^{(l)}$, and defined for any
$f \in \mathcal{B}(S^{(l)})$ by the following formula:

\[ \Psi_G(\eta_p)(f) = \eta_p(Gf)/\eta_p(G). \]

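We let $Q_l$ be the integral operator from $\mathcal{B}(S^{(l)})$ into $\mathcal{B}(S^{(l-1)})$ given by

\[ \forall f \in \mathcal{B}(S^{(l)}) \qquad Q_l(f) := G_{l-1} \times L_l(f) \in \mathcal{B}(S^{(l-1)}). \tag{7.1} \]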
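To make the Boltzmann-Gibbs transformation concrete, here is a minimal sketch on a finite state space, where a measure is a list of weights and a function is a list of values. The names `boltzmann_gibbs` and `integrate`, and the numerical values, are illustrative assumptions and not part of the paper.

```python
from typing import List, Sequence


def integrate(mu: Sequence[float], f: Sequence[float]) -> float:
    """mu(f) = sum_x mu(x) f(x) on a finite state space."""
    return sum(m * v for m, v in zip(mu, f))


def boltzmann_gibbs(eta: Sequence[float], G: Sequence[float]) -> List[float]:
    """Return the reweighted measure with weights eta(x) G(x) / eta(G)."""
    norm = integrate(eta, G)
    if norm <= 0:
        raise ValueError("eta(G) must be positive")
    return [e * g / norm for e, g in zip(eta, G)]


# Hypothetical three-point example.
eta = [0.5, 0.3, 0.2]        # probability measure
G = [1.0, 2.0, 4.0]          # positive potential
f = [1.0, 0.0, -1.0]         # bounded test function

psi = boltzmann_gibbs(eta, G)
lhs = integrate(psi, f)                                   # Psi_G(eta)(f)
rhs = integrate(eta, [g * v for g, v in zip(G, f)]) / integrate(eta, G)
```

Here `lhs` and `rhs` agree up to rounding, which is exactly the defining formula $\eta(Gf)/\eta(G)$, and `psi` is again a probability measure.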

By definition of the mappings $\Phi_l$ given in (2.1), it is easy to check that

\[ \Phi^{(l)}(\eta) = \Psi^{(l),Q_l(1)}(\eta)\,L_l \]

with

\[ \forall n \ge 0 \qquad \Psi_n^{(l),Q_l(1)}(\eta) = \frac{1}{n+1} \sum_{p=0}^{n} \Psi_{Q_l(1)}(\eta_p). \]

Definition 7.2. We let $\Phi^{(k,l)}$ be the semigroup associated with the
Feynman-Kac transformations discussed in (7.2), and we denote by

\[ Q_{l,k} = Q_l Q_{l+1} \cdots Q_k \]

the semigroup associated with the integral operator $Q_l$ introduced in (7.1).
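Proposition 7.3. For any $l \le k$ we have that

\[ \Phi^{(k,l)}(\eta) = \Psi^{(k,l)}(\eta)\,P_{l,k} \quad\text{with}\quad P_{l,k}(f) = \frac{Q_{l,k}(f)}{Q_{l,k}(1)} \tag{7.3} \]

and the mapping $\Psi^{(k,l)}$ from $\mathcal{P}(S^{(l-1)})^{\mathbb{N}}$ into itself given below:

\[ \Psi^{(k,l)} = \Psi^{(l),H_{l,k}} \circ \Psi^{(l),H_{l,k-1}} \circ \cdots \circ \Psi^{(l),H_{l,l}} \quad\text{with}\quad H_{l,k} = \frac{Q_{l,k}(1)}{Q_{l,k-1}(1)}. \]

For $l = k$, we use the conventions $\Psi^{(l-1,l)} = \mathrm{Id}$ and $Q_{l,k-1}(1) = Q_{l,l-1}(1) = 1$,
so that $H_{l,l} = Q_{l,l}(1) = Q_l(1)$ and $\Psi^{(l,l)} = \Psi^{(l),Q_l(1)}$.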

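Proof. We prove the proposition by induction on the parameter $m = (k - l)$.
For $k = l$, we clearly have

\[ P_{l,l}(f) = \frac{Q_l(f)}{Q_l(1)} = L_l(f) \]

and

\[ \Phi^{(l,l)}(\eta) = \Psi^{(l),Q_l(1)}(\eta)\,L_l = \Psi^{(l,l)}(\eta)\,P_{l,l}. \]

Suppose we have proved formula (7.3) for some $m = (k - l) \ge 0$. To check
the result at level $m + 1 = (k - l) + 1 = ((k+1) - l)$, we first observe that

\[ \Phi^{(k+1)}(\Phi^{(k,l)}(\eta)) = \Psi^{(k+1),Q_{k+1}(1)}(\Phi^{(k,l)}(\eta))\,P_{k+1,k+1}. \]

For any $\mu \in \mathcal{P}(S^{(k)})^{\mathbb{N}}$, we also have that

\[ \Psi_n^{(k+1),Q_{k+1}(1)}(\mu)\,P_{k+1,k+1}(f) = \frac{1}{n+1} \sum_{p=0}^{n} \frac{\mu_p(Q_{k+1}(f))}{\mu_p(Q_{k+1}(1))} \]

so that

\[ \Psi_n^{(k+1),Q_{k+1}(1)}(\Phi^{(k,l)}(\eta))\,P_{k+1,k+1}(f) = \frac{1}{n+1} \sum_{p=0}^{n} \frac{\Phi_p^{(k,l)}(\eta)(Q_{k+1}(f))}{\Phi_p^{(k,l)}(\eta)(Q_{k+1}(1))}. \]

Using the induction hypothesis, we find that

\[ \Phi_p^{(k,l)}(\eta)(Q_{k+1}(f)) = \Psi_p^{(k,l)}(\eta)[P_{l,k}(Q_{k+1}(f))]. \]

We also have

\[ P_{l,k}(Q_{k+1}(f)) = \frac{Q_{l,k+1}(1)}{Q_{l,k}(1)}\,P_{l,k+1}(f) = H_{l,k+1}\,P_{l,k+1}(f) \]

from which we prove that

\[ \Psi_p^{(k,l)}(\eta)[P_{l,k}(Q_{k+1}(f))] = \Psi_p^{(k,l)}(\eta)[H_{l,k+1}\,P_{l,k+1}(f)]. \]

This clearly yields that

\[
\frac{\Phi_p^{(k,l)}(\eta)(Q_{k+1}(f))}{\Phi_p^{(k,l)}(\eta)(Q_{k+1}(1))}
= \frac{\Psi_p^{(k,l)}(\eta)[H_{l,k+1}\,P_{l,k+1}(f)]}{\Psi_p^{(k,l)}(\eta)[H_{l,k+1}]}
= \Psi_{H_{l,k+1}}(\Psi_p^{(k,l)}(\eta))\,P_{l,k+1}(f)
\]

and therefore

\[
\Phi_n^{(k+1,l)}(\eta)(f)
= \frac{1}{n+1} \sum_{p=0}^{n} \Psi_{H_{l,k+1}}(\Psi_p^{(k,l)}(\eta))\,P_{l,k+1}(f)
= \Psi_n^{(l),H_{l,k+1}}(\Psi^{(k,l)}(\eta))\,P_{l,k+1}(f).
\]

In summary, we have proved that

\[ \Phi^{(k+1,l)}(\eta) = \Psi^{(k+1,l)}(\eta)\,P_{l,k+1} \]

with $\Psi^{(k+1,l)}(\eta) = \Psi^{(l),H_{l,k+1}}(\Psi^{(k,l)}(\eta))$.
This ends the proof of the proposition. □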
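The algebraic step that drives the induction is the identity $P_{l,k}(Q_{k+1}(f)) = H_{l,k+1} P_{l,k+1}(f)$. The sketch below checks it numerically on finite state spaces, where each $Q_i$ is represented by a positive square matrix. The random matrices, the common dimension `d`, and all function names are assumptions made purely for illustration.

```python
import random
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def matvec(Q: Matrix, f: Vector) -> Vector:
    """Apply the integral operator Q to the function f."""
    return [sum(q * v for q, v in zip(row, f)) for row in Q]


def apply_chain(Qs: List[Matrix], f: Vector) -> Vector:
    """Q_{l,k}(f) = Q_l(Q_{l+1}(... Q_k(f))): the last operator acts first."""
    for Q in reversed(Qs):
        f = matvec(Q, f)
    return f


def normalized(Qs: List[Matrix], f: Vector) -> Vector:
    """P_{l,k}(f) = Q_{l,k}(f) / Q_{l,k}(1), evaluated pointwise."""
    ones = [1.0] * len(f)
    return [a / b for a, b in zip(apply_chain(Qs, f), apply_chain(Qs, ones))]


def identity_gap(d: int = 4, length: int = 3, seed: int = 0) -> float:
    """Max pointwise gap between P_{l,k}(Q_{k+1} f) and H_{l,k+1} P_{l,k+1}(f)."""
    rng = random.Random(seed)
    Qs = [[[rng.uniform(0.1, 1.0) for _ in range(d)] for _ in range(d)]
          for _ in range(length + 1)]          # Q_l, ..., Q_k, Q_{k+1}
    f = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    ones = [1.0] * d
    head, last = Qs[:-1], Qs[-1]
    lhs = normalized(head, matvec(last, f))
    # H_{l,k+1} = Q_{l,k+1}(1) / Q_{l,k}(1)
    H = [a / b for a, b in zip(apply_chain(Qs, ones), apply_chain(head, ones))]
    rhs = [h * p for h, p in zip(H, normalized(Qs, f))]
    return max(abs(a - b) for a, b in zip(lhs, rhs))
```

Calling `identity_gap()` with different seeds should return values at the level of floating-point rounding, reflecting that both sides equal $Q_{l,k+1}(f)/Q_{l,k}(1)$.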

7.2. Contraction inequalities. 

Proposition 7.4. For any $l \le k$ we have

\[ \beta(P_{l,k}) = \tfrac{1}{2} \sup_{\eta,\mu} \|\Phi^{(k,l)}(\eta) - \Phi^{(k,l)}(\mu)\|. \]
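For intuition about the contraction coefficient $\beta$, the following sketch computes it for a Markov transition matrix on a finite space, as half the largest $\ell^1$ distance between two rows. It assumes the convention $\|\mu - \nu\| = \sum_x |\mu(x) - \nu(x)|$, which is what the factor 2 in the proof suggests; the matrix `P` and the helper names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def l1_distance(mu: Vector, nu: Vector) -> float:
    """||mu - nu|| = sum_x |mu(x) - nu(x)|, a value in [0, 2]."""
    return sum(abs(a - b) for a, b in zip(mu, nu))


def push_forward(mu: Vector, P: Matrix) -> Vector:
    """The measure mu P, that is (mu P)(z) = sum_x mu(x) P(x, z)."""
    return [sum(mu[x] * P[x][z] for x in range(len(mu)))
            for z in range(len(P[0]))]


def dobrushin(P: Matrix) -> float:
    """beta(P) = (1/2) * max over rows x, y of ||P(x, .) - P(y, .)||."""
    return 0.5 * max(l1_distance(r, s) for r in P for s in P)


# Hypothetical 3-state transition matrix (each row sums to one).
P = [[0.5, 0.5, 0.0],
     [0.2, 0.3, 0.5],
     [0.1, 0.1, 0.8]]

beta = dobrushin(P)

# Dirac masses at the first and last states.
delta_x, delta_y = [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]
gap_after = l1_distance(push_forward(delta_x, P), push_forward(delta_y, P))
```

In this example the pair of Dirac masses attains the supremum, so `gap_after` coincides with `2 * beta` up to rounding, mirroring the role of the Dirac flows in the proof.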

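Proof. Using Proposition 7.3, we find that

\[
\|\Phi^{(k,l)}(\eta) - \Phi^{(k,l)}(\mu)\|
= \|[\Psi^{(k,l)}(\eta) - \Psi^{(k,l)}(\mu)]\,P_{l,k}\|
\le \beta(P_{l,k})\,\|\Psi^{(k,l)}(\eta) - \Psi^{(k,l)}(\mu)\|.
\]

This implies that

\[ \sup_{\eta,\mu} \|\Phi^{(k,l)}(\eta) - \Phi^{(k,l)}(\mu)\| \le 2\beta(P_{l,k}). \]

On the other hand, if we choose the constant Dirac distribution flows
$\eta = (\eta_n)_{n \ge 0}$ and $\mu = (\mu_n)_{n \ge 0}$ given by

\[ \forall n \ge 0 \qquad \eta_n = \delta_x \quad\text{and}\quad \mu_n = \delta_y \]

for some $x, y \in S^{(l-1)}$, we also have that

\[ \Phi_n^{(k,l)}(\eta) = \delta_x P_{l,k} \quad\text{and}\quad \Phi_n^{(k,l)}(\mu) = \delta_y P_{l,k}. \]

This implies that

\[
\sup_{\eta,\mu} \|\Phi^{(k,l)}(\eta) - \Phi^{(k,l)}(\mu)\|
\ge \sup_{x,y} \|\delta_x P_{l,k} - \delta_y P_{l,k}\| = 2\beta(P_{l,k}).
\]

This ends the proof of the proposition. □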

Our next objective is to estimate the contraction coefficient $\beta(P_{l,k})$ in
terms of the mixing type properties of the semigroup $L_{l,k} = L_l L_{l+1} \cdots L_k$
associated with the Markov operators $L_l$. We introduce the following
regularity conditions.

$(L)_m$ There exists an integer $m \ge 1$ and a sequence $(\varepsilon_l(L))_{l \ge 0} \in (0,1)^{\mathbb{N}}$
such that

\[ \forall l \ge 0,\ \forall (x,y) \in (S^{(l)})^2 \qquad L_{l+1,l+m}(x,\cdot) \ge \varepsilon_l(L)\,L_{l+1,l+m}(y,\cdot). \]

It is well known that the above condition is satisfied for any aperiodic and
irreducible Markov chain on a finite space. Loosely speaking, for noncompact
spaces this condition is related to the tails of the transition distributions on
the boundaries of the state space. For instance, let us assume that $S^{(l)} = \mathbb{R}$
and $L_l$ is the bi-Laplace transition given by

\[ L_l(x,dy) = \frac{c(l)}{2}\,e^{-c(l)\,|y - A_l(x)|}\,dy \]

for some $c(l) > 0$ and some drift function $A_l$ with bounded oscillations
$\mathrm{osc}(A_l) < \infty$. In this case, it is readily checked that condition $(L)_m$ holds
true for $m = 1$ with the parameter

\[ \varepsilon_{l-1}(L) = \exp(-c(l)\,\mathrm{osc}(A_l)). \]
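The bi-Laplace claim follows from the triangle inequality: the density ratio $L_l(x,dy)/L_l(x',dy) = \exp(-c(l)(|y - A_l(x)| - |y - A_l(x')|))$ is at least $\exp(-c(l)|A_l(x) - A_l(x')|)$, hence at least $\exp(-c(l)\,\mathrm{osc}(A_l))$. The sketch below checks this lower bound on a grid. The drift `math.tanh`, the rate `c = 1.5`, the grids, and the reading $\mathrm{osc}(A) = \sup_{x,x'} |A(x) - A(x')|$ are all assumptions made for illustration.

```python
import math
from typing import Callable, Sequence


def laplace_density(y: float, center: float, c: float) -> float:
    """Density (c/2) * exp(-c * |y - center|) of the bi-Laplace transition."""
    return 0.5 * c * math.exp(-c * abs(y - center))


def min_density_ratio(A: Callable[[float], float], c: float,
                      xs: Sequence[float], ys: Sequence[float]) -> float:
    """Smallest ratio L(x, dy) / L(x', dy) over the grid points x, x', y."""
    return min(
        laplace_density(y, A(x), c) / laplace_density(y, A(xp), c)
        for x in xs for xp in xs for y in ys
    )


A = math.tanh                 # hypothetical drift; sup - inf over R equals 2
c = 1.5                       # hypothetical rate parameter
osc_A = 2.0
xs = [0.5 * i for i in range(-6, 7)]      # starting points in [-3, 3]
ys = [0.25 * j for j in range(-80, 81)]   # arrival points in [-20, 20]

epsilon = math.exp(-c * osc_A)            # claimed mixing constant
worst_ratio = min_density_ratio(A, c, xs, ys)
```

On this grid `worst_ratio` stays above `epsilon`, in line with condition $(L)_m$ for $m = 1$; a grid check of this kind is only a sanity test and does not replace the one-line analytic argument.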

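Under the condition (G) presented on page 11 and the mixing condition
$(L)_m$ stated above, we proved in [5] (see Corollary 4.3.3 on page 141) that
we have for any $k \ge m \ge 1$, and $l \ge 1$

\[
\beta(P_{l+1,l+k}) \le \prod_{i=0}^{\lfloor k/m \rfloor - 1} \bigl(1 - \varepsilon^{(m)}_{l+im}\bigr)
\quad\text{with}\quad
\varepsilon^{(m)}_{l} := \varepsilon_l(L)^2 \prod_{l+1 \le k < l+m} \varepsilon_k(G).
\]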

Several contraction inequalities can be deduced from these estimates; we
refer to Chapter 4 of the book [5]. To give a flavor of these results, we
further assume that $(L)_m$ is satisfied with $m = 1$ and $\varepsilon(L) = \inf_l \varepsilon_l(L) > 0$.
In this case, we can check that

\[ \beta(P_{l+1,l+k}) \le (1 - \varepsilon(L)^2)^k. \]
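The geometric bound can be illustrated in the simplest special case, which is an assumption of this sketch and not the general setting of the paper: constant potentials ($G \equiv 1$) and a time-homogeneous kernel $L$ on a finite space, so that $P_{l+1,l+k}$ reduces to the matrix power $L^k$. The matrix `L` and the helper names are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(A: Matrix, B: Matrix) -> Matrix:
    """Plain matrix product, representing the composition of two kernels."""
    return [[sum(A[i][j] * B[j][k] for j in range(len(B)))
             for k in range(len(B[0]))] for i in range(len(A))]


def dobrushin(P: Matrix) -> float:
    """beta(P): half the largest l1 distance between two rows of P."""
    return 0.5 * max(sum(abs(a - b) for a, b in zip(r, s))
                     for r in P for s in P)


def mixing_constant(P: Matrix) -> float:
    """Largest eps with P(x, z) >= eps * P(y, z) for all x, y, z.

    Requires strictly positive entries.
    """
    n = len(P)
    return min(P[x][z] / P[y][z]
               for x in range(n) for y in range(n) for z in range(len(P[0])))


def contraction_table(L: Matrix, eps: float,
                      k_max: int) -> List[Tuple[int, float, float]]:
    """Rows (k, beta(L^k), (1 - eps^2)^k) for k = 1, ..., k_max."""
    rows, power = [], L
    for k in range(1, k_max + 1):
        rows.append((k, dobrushin(power), (1.0 - eps ** 2) ** k))
        power = matmul(power, L)
    return rows


# Hypothetical strictly positive 3-state kernel.
L = [[0.5, 0.3, 0.2],
     [0.2, 0.5, 0.3],
     [0.3, 0.2, 0.5]]

eps = mixing_constant(L)
table = contraction_table(L, eps, k_max=6)
```

In each row of `table` the computed coefficient should sit below the bound $(1 - \varepsilon(L)^2)^k$. In this potential-free case the sharper estimate $(1 - \varepsilon(L))^k$ also holds; the squared constant in the text accounts for the general Feynman-Kac normalization.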